Using sklearn.discriminant_analysis.LinearDiscriminantAnalysis, I want to extract the features that best distinguish 3 classes of data, using n_components=2 linear discriminant axes. (My spectral data has 2151 features/wavelengths.)
However, lda.coef_ returns an ndarray with shape (3, 2151) instead of the (2, 2151) I expected: I wanted the coefficients for each discriminant axis, of which there are 2, not 3. What is wrong with my expectation, and why are there 3 sets of 2151 coefficients rather than 2?
'reflectance' is a pd.DataFrame with shape (800, 2151), indicating 800 samples of 2151 wavelengths. These 800 samples come from 3 labelled classes (0, 1, and 2), which are stored in 'int_labels'.
CODE:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
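# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(reflectance, int_labels, test_size=0.3)
# Fit LDA with 2 discriminant axes and project the training data onto them
lda = LinearDiscriminantAnalysis(n_components=2)
x_train_lda = lda.fit(X_train, y_train).transform(X_train)
coefficients = lda.coef_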
coefficients.shape returns (3, 2151) rather than the (2, 2151) I expected.
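
For anyone who wants to reproduce this without my data, here is a minimal sketch. It assumes synthetic data from make_blobs (300 samples, 20 features, 3 classes) as a stand-in for the reflectance spectra, and the default solver="svd". It prints the shape of coef_ next to the shape of scalings_, which, as far as I can tell, is the attribute that transform() actually uses for the projection. The names coef_shape, axes and manual are only illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for the spectral data (assumption): 300 samples,
# 20 features, 3 classes labelled 0, 1, 2.
X, y = make_blobs(n_samples=300, n_features=20, centers=3, random_state=0)

lda = LinearDiscriminantAnalysis(n_components=2)  # default solver="svd"
X_lda = lda.fit(X, y).transform(X)

# coef_ holds one weight vector per class (used by the classifier's
# decision function), so its first dimension is n_classes, not n_components.
coef_shape = lda.coef_.shape

# scalings_ holds the projection that transform() applies; keep its first
# n_components columns to get one weight vector per discriminant axis.
axes = lda.scalings_[:, :2]

# With the svd solver, transform() centres the data on xbar_ and projects it.
manual = (X - lda.xbar_) @ axes

print("coef_:", coef_shape)
print("discriminant axes:", axes.shape)
print("transform output:", X_lda.shape)
print("manual projection matches transform:", np.allclose(manual, X_lda))
```

If that reading is right, the per-axis weights I was after would be the columns of lda.scalings_ (one column per discriminant), while lda.coef_ describes the per-class decision functions.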