I am trying to run linear discriminant analysis in R. My data frame contains two groups and has dimensions 102 x 24 (rows x columns). I ran the following R code:
mydata<-read.table()
head(mydata)
Factor TL SL FL HL HH EHH BH BW CL CH FNL DFH AFL AFH PFL
1 1 86.0 68.4 77.5 15.4 14.1 9.4 21.3 4.7 14.2 9.8 6.8 13.0 10.2 10.2 1.7
2 1 71.8 57.4 65.1 14.3 12.1 8.2 16.3 4.1 9.1 6.5 5.5 10.4 8.9 7.8 1.1
3 1 82.9 64.3 72.8 15.3 13.1 8.3 19.1 4.7 10.7 9.5 7.7 12.4 10.9 8.1 1.6
4 1 74.2 56.5 55.7 14.3 11.8 7.2 18.7 5.2 7.5 5.7 5.6 11.8 9.4 7.8 1.2
5 1 66.8 52.1 61.1 13.1 10.9 7.9 15.5 5.5 7.2 5.4 4.2 10.1 6.5 5.5 1.1
6 1 72.6 58.9 61.7 13.5 12.4 8.2 18.3 6.1 9.7 7.6 6.8 10.4 5.6 8.9 1.2
PFH ABFL ABFH Sin_P Posh_P B_P B_M B_M_B
1 13.7 1.8 9.4 16.3 34.6 39.6 48.1 29.1
2 9.4 1.2 6.3 9.4 30.5 32.8 38.4 23.8
3 12.2 1.7 9.1 16.4 34.6 39.5 44.8 30.1
4 11.1 1.3 5.7 14.3 31.6 29.1 41.1 23.2
5 9.2 1.1 6.8 14.8 30.2 29.1 36.3 23.4
6 9.8 1.9 8.5 15.4 30.9 32.9 41.9 25.1
library(MASS)
ord <- lda(Factor ~ ., mydata)
ord
Call:
lda(Factor ~ ., data = mydata)
Prior probabilities of groups:
1 2
0.5 0.5
Group means:
TL SL FL HL HH EHH BH BW
1 73.29020 57.99412 64.90392 14.15686 13.33137 8.347059 16.41373 5.821569
2 76.44118 61.42745 68.01569 14.48627 12.54510 8.227451 16.15294 7.586275
CL CH FNL DFH AFL AFH PFL PFH
1 8.427451 6.449020 6.070588 11.70980 8.611765 8.233333 1.360784 10.92157
2 8.752941 6.619608 6.954902 12.99412 8.821569 9.013725 2.754902 11.37255
ABFL ABFH Sin_P Posh_P B_P B_M B_M_B
1 1.482353 7.982353 14.78235 32.70196 32.94314 39.09235 23.77157
2 1.698039 8.639216 15.40196 33.13725 33.78431 40.99020 24.82745
Coefficients of linear discriminants:
LD1
TL -0.158877362
SL 0.085504033
FL -0.001151154
HL 0.001549496
HH -0.006513463
EHH -0.457378984
BH -0.071013364
BW 0.682076101
CL 0.124730256
CH 0.064695108
FNL 0.059726102
DFH 0.193330210
AFL -0.121504298
AFH 0.126553648
PFL 0.092334665
PFH 0.162660412
ABFL 0.041923390
ABFH -0.168389200
Sin_P -0.071962994
Posh_P -0.093672821
B_P 0.082480896
B_M 0.030929099
B_M_B 0.037913734
But when I try to plot the output, I get this error:
library(ggord)
ggord(ord, mydata$Factor)
Error in predict(ord_in)$x[, axes] : subscript out of bounds
I found that the problem is that I have just LD1 in the output and LD2 is not available.
Can anyone kindly solve this?
You can find mydata at this link:
https://www.dropbox.com/preview/Foruhar/morph.txt
LDA produces at most min(n, c-1) discriminants, where c is the number of classes and n is the number of features. With two classes you therefore get only LD1. ggord needs two dimensions, so it does not work here. Instead, try a histogram or density plot of the LD1 scores, colored by class. Your data link is not valid (it works only for you). Here's an example on generated data:
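A minimal sketch of that idea, assuming simulated data and base R graphics rather than any particular plotting package. The names `sim`, `x1` to `x3`, the seed, and the class means are all made up for illustration:

```r
library(MASS)  # provides lda()

set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical generated data: two classes, 51 rows each, three features.
n <- 51
sim <- data.frame(
  Factor = factor(rep(c(1, 2), each = n)),
  x1 = c(rnorm(n, mean = 0), rnorm(n, mean = 2)),
  x2 = c(rnorm(n, mean = 0), rnorm(n, mean = 1)),
  x3 = rnorm(2 * n)
)

fit <- lda(Factor ~ ., data = sim)

# With two classes, predict()$x has a single column (LD1).
scores <- data.frame(LD1 = predict(fit)$x[, 1], Factor = sim$Factor)

# One kernel density estimate per class on the single discriminant axis.
dens <- lapply(split(scores$LD1, scores$Factor), density)

# Empty plot sized to hold every curve, then one colored line per class.
plot(NULL,
     xlim = range(sapply(dens, function(d) range(d$x))),
     ylim = c(0, max(sapply(dens, function(d) max(d$y)))),
     xlab = "LD1", ylab = "Density")
for (i in seq_along(dens)) lines(dens[[i]], col = i + 1, lwd = 2)
legend("topright", legend = names(dens),
       col = seq_along(dens) + 1, lwd = 2, title = "Factor")
```

If a quick look is enough, `plot(fit, type = "both")` from MASS also draws per-group histograms with density curves when there is only one discriminant.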
With your data, you would do something like this:
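A hedged sketch, assuming `mydata` is the data frame from the question, with the grouping column named `Factor` and every other column a numeric predictor. The helper name `ld1_scores` is hypothetical, introduced only for illustration:

```r
library(MASS)  # provides lda()

# Hypothetical helper: fit LDA on a data frame whose grouping column is
# named `Factor` and return the LD1 scores alongside the class labels.
ld1_scores <- function(df) {
  df$Factor <- factor(df$Factor)  # make sure the grouping is a factor
  fit <- lda(Factor ~ ., data = df)
  data.frame(LD1 = predict(fit, newdata = df)$x[, 1], Factor = df$Factor)
}

# `mydata` is not defined in this sketch, so the call runs only if the
# data frame already exists in the workspace.
if (exists("mydata")) {
  scores <- ld1_scores(mydata)
}
```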
and then continue as above.