Why does lda.coef_ output a different number of coefficient sets to the number of discriminant components?


Using sklearn.discriminant_analysis.LinearDiscriminantAnalysis, I want to extract the features that best distinguish 3 classes of data, using n_components=2 linear discriminant axes. My spectral data has 2151 features (wavelengths).

However, lda.coef_ is an ndarray with shape (3, 2151) instead of the (2, 2151) I expected, since I wanted the coefficients for each discriminant (of which there are 2, not 3). What is wrong with my expectation, and why are there 3 sets of 2151 coefficients rather than 2?

'reflectance' is a pd.DataFrame with shape (800, 2151), i.e. 800 samples of 2151 wavelengths. These 800 samples come from 3 labelled classes, 0, 1, and 2, which are stored in 'int_labels'.

CODE:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(reflectance, int_labels, test_size=0.3)

# Fit LDA and project the training data onto 2 discriminant axes
lda = LinearDiscriminantAnalysis(n_components=2)
x_train_lda = lda.fit(X_train, y_train).transform(X_train)

coefficients = lda.coef_

coefficients.shape returns (3, 2151) rather than (2, 2151).
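
A minimal sketch of the shape difference, run on synthetic data rather than the reflectance DataFrame above (the array sizes, random seed, and class centres are assumptions made purely for illustration). In scikit-learn, coef_ holds the weights of the classifier's linear decision function, with one row per class when there are more than two classes, so its first dimension follows the number of classes and not n_components. With the default 'svd' solver, the projection used by transform() is stored in scalings_, and its leading n_components columns appear to be the per-feature weights of the discriminant axes:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for the spectral data (assumed sizes, not the real set):
# 300 samples, 20 features, 3 classes with different class centres.
rng = np.random.default_rng(0)
n_per_class, n_features, n_classes = 100, 20, 3
y = np.repeat(np.arange(n_classes), n_per_class)
centres = rng.normal(scale=2.0, size=(n_classes, n_features))
X = centres[y] + rng.normal(size=(y.size, n_features))

n_components = 2
lda = LinearDiscriminantAnalysis(n_components=n_components)  # default solver="svd"
X_lda = lda.fit(X, y).transform(X)

# Decision-function weights: one row per class.
print(lda.coef_.shape)

# Discriminant-axis weights: one column per component.
axes = lda.scalings_[:, :n_components]
print(axes.shape)

# Reproduce transform() by hand; xbar_ is the overall mean set by the svd solver.
X_manual = (X - lda.xbar_) @ axes
print(np.allclose(X_lda, X_manual))
```

Ranking features by the absolute values in each column of axes is one possible way to see which wavelengths drive each discriminant, though that reading depends on how the features are scaled.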
