I have a data frame with several thousand rows and only a few columns. I have pasted a portion of my data and code below. I am looking at the "Section" preferences of individuals and want to know what might be influencing their choice (e.g., density, area). I successfully converted the data to long format using mlogit.data, but I'm having trouble running the mlogit function in R, which is the final step.
Below is my data frame:
> df[1:30,]
Date Indiv Sections Choices Density Area
1 8/1/2017 1 A Yes 0.13 21.7
2 8/1/2017 1 B No 0.29 12.2
3 8/1/2017 1 C No 0.23 7.5
4 8/1/2017 1 D No 0.05 3.7
5 8/1/2017 1 E No 0.31 29.3
6 8/1/2017 2 A No 0.13 21.7
7 8/1/2017 2 B No 0.29 12.2
8 8/1/2017 2 C Yes 0.23 7.5
9 8/1/2017 2 D No 0.05 3.7
10 8/1/2017 2 E No 0.31 29.3
11 8/1/2017 3 A No 0.13 21.7
12 8/1/2017 3 B Yes 0.29 12.2
13 8/1/2017 3 C No 0.23 7.5
14 8/1/2017 3 D No 0.05 3.7
15 8/1/2017 3 E No 0.31 29.3
16 8/2/2017 1 A No 0.19 21.7
17 8/2/2017 1 B No 0.27 12.2
18 8/2/2017 1 C Yes 0.43 7.5
19 8/2/2017 1 D No 0.11 3.7
20 8/2/2017 1 E No 0.47 29.3
21 8/2/2017 2 A No 0.19 21.7
22 8/2/2017 2 B No 0.27 12.2
23 8/2/2017 2 C No 0.43 7.5
24 8/2/2017 2 D No 0.11 3.7
25 8/2/2017 2 E Yes 0.47 29.3
26 8/2/2017 3 A No 0.19 21.7
27 8/2/2017 3 B No 0.27 12.2
28 8/2/2017 3 C No 0.43 7.5
29 8/2/2017 3 D No 0.11 3.7
30 8/2/2017 3 E Yes 0.47 29.3
After I run the mlogit.data function with the raw data I get this:
Date Indiv Sections Choices Density Area
A 8/1/2017 1 A TRUE 0.13 21.7
B 8/1/2017 1 B FALSE 0.29 12.2
C 8/1/2017 1 C FALSE 0.23 7.5
D 8/1/2017 1 D FALSE 0.05 3.7
E 8/1/2017 1 E FALSE 0.31 29.3
A 8/1/2017 2 A FALSE 0.13 21.7
B 8/1/2017 2 B FALSE 0.29 12.2
C 8/1/2017 2 C TRUE 0.23 7.5
D 8/1/2017 2 D FALSE 0.05 3.7
E 8/1/2017 2 E FALSE 0.31 29.3
A 8/1/2017 3 A FALSE 0.13 21.7
B 8/1/2017 3 B TRUE 0.29 12.2
C 8/1/2017 3 C FALSE 0.23 7.5
D 8/1/2017 3 D FALSE 0.05 3.7
E 8/1/2017 3 E FALSE 0.31 29.3
A 8/2/2017 1 A FALSE 0.19 21.7
B 8/2/2017 1 B FALSE 0.27 12.2
C 8/2/2017 1 C TRUE 0.43 7.5
D 8/2/2017 1 D FALSE 0.11 3.7
E 8/2/2017 1 E FALSE 0.47 29.3
A 8/2/2017 2 A FALSE 0.19 21.7
B 8/2/2017 2 B FALSE 0.27 12.2
C 8/2/2017 2 C FALSE 0.43 7.5
D 8/2/2017 2 D FALSE 0.11 3.7
E 8/2/2017 2 E TRUE 0.47 29.3
A 8/2/2017 3 A FALSE 0.19 21.7
B 8/2/2017 3 B FALSE 0.27 12.2
C 8/2/2017 3 C FALSE 0.43 7.5
D 8/2/2017 3 D FALSE 0.11 3.7
E 8/2/2017 3 E TRUE 0.47 29.3
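The mlogit.data call itself is not shown above. As a rough sketch only, a conversion along the following lines would give a table of this shape. The toy data frame reproduces the 30 rows listed earlier, and the chid column is a hypothetical helper (not in the original data) that labels each Date-by-Indiv choice situation; the actual call used may have differed.

```r
library(mlogit)

# Toy reconstruction of the 30 rows shown above (Sections varies fastest,
# then Indiv, then Date, matching the row order of the printed data frame).
df <- expand.grid(Sections = c("A", "B", "C", "D", "E"),
                  Indiv = 1:3,
                  Date = c("8/1/2017", "8/2/2017"))
df$Density <- c(rep(c(0.13, 0.29, 0.23, 0.05, 0.31), 3),
                rep(c(0.19, 0.27, 0.43, 0.11, 0.47), 3))
df$Area <- rep(c(21.7, 12.2, 7.5, 3.7, 29.3), 6)

# Hypothetical helper: one id per choice situation (a Date x Indiv pair).
key <- paste(df$Date, df$Indiv)
df$chid <- match(key, unique(key))

# Section chosen in each of the six choice situations, stored as Yes/No
# the way the raw data has it.
chosen <- c("A", "C", "B", "C", "E", "E")
df$Choices <- ifelse(df$Sections == chosen[df$chid], "Yes", "No")

# Convert Yes/No to logical so the choice indicator is unambiguous.
df$Choices <- df$Choices == "Yes"

# Long-format conversion: one row per alternative within a choice situation.
ml_df <- mlogit.data(df, choice = "Choices", shape = "long",
                     alt.var = "Sections", chid.var = "chid")
head(ml_df, 10)
```

Passing chid.var explicitly avoids relying on row order to work out which rows belong to the same choice situation.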
Below is my mlogit syntax in R:
ML <- mlogit(Choices ~ Density + Area, data = df, method="nr")
Below is the error message:
Error in solve.default(H, g[!fixed]) :
Lapack routine dgesv: system is exactly singular: U[5,5] = 0
I've spent several days modifying the code and researching the issue, but I still can't make it run. I would very much like to know what I am doing incorrectly and get some guidance on how to make the mlogit function work with my data.
Thank you very much for your help on this.
If your data are what I think they are, I had this problem too.
It seems to me that density and area are both alternative-specific variables that do not vary across individuals (although density does vary over time within an alternative). So I think you need alternative-specific variables with generic coefficients. But if you do not have any alternative-specific variables that DO vary across individuals, you don't have enough varying terms to estimate the model with intercepts. So run your model without intercepts, and hopefully it should work.
For the clues about not having enough terms, and the relevant vignette quote, see the short discussion from when I asked about this: R: Can I analyze non-varying-across-individual alternative-specific attribute variables with mlogit?
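The code for the intercept-free model did not survive in the post, so the following is only a sketch of what it may have looked like. It assumes the usual mlogit formula convention in which the second part of the formula (after |) holds the intercepts and individual-specific terms, so that | 0 drops the intercepts. The toy data reproduce the 30 rows from the question, and chid is a hypothetical choice-situation id. The rank check is an illustration added here: in the rows shown, Area takes a single value per section, so it cannot be separated from section-specific intercepts.

```r
library(mlogit)

# Toy reconstruction of the 30 rows from the question.
df <- expand.grid(Sections = c("A", "B", "C", "D", "E"),
                  Indiv = 1:3,
                  Date = c("8/1/2017", "8/2/2017"))
df$chid <- rep(1:6, each = 5)  # hypothetical id: one per Date x Indiv pair
df$Choices <- df$Sections == c("A", "C", "B", "C", "E", "E")[df$chid]
df$Density <- c(rep(c(0.13, 0.29, 0.23, 0.05, 0.31), 3),
                rep(c(0.19, 0.27, 0.43, 0.11, 0.47), 3))
df$Area <- rep(c(21.7, 12.2, 7.5, 3.7, 29.3), 6)

# Illustration of the collinearity: section dummies plus Area are rank
# deficient because Area is constant within each section.
X <- model.matrix(~ Sections + Area, data = df)
c(rank = qr(X)$rank, columns = ncol(X))

ml_df <- mlogit.data(df, choice = "Choices", shape = "long",
                     alt.var = "Sections", chid.var = "chid")

# Generic coefficients for Density and Area, no alternative-specific
# intercepts (the "| 0" part).
fit <- mlogit(Choices ~ Density + Area | 0, data = ml_df)
summary(fit)
```

With only six choice situations the estimates from this toy fit mean nothing; the point is that the model can be estimated once the intercepts are removed.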