mlogit: missing value where TRUE/FALSE needed


I have data from a discrete choice experiment (DCE) looking at hiring preferences for individuals from different sectors, which I've put into long format. I want to model it using mlogit. I have exported the data and can successfully run the model in Stata using the asclogit command, but I'm having trouble getting it to run in R.

Here's a snapshot of the first 25 rows of data:

> data[1:25,]
   userid    chid item sector outcome cul fit ind led prj rel
1   11275  211275    2      1       1   0   1   0   1   1   1
2   11275  211275    2      2       0   1   0   0   0   0   0
3   11275  211275    2      0       0   0   0   1   1   0   1
4   11275  311275    3      0       1   1   1   0   0   0   1
5   11275  311275    3      2       0   0   1   0   0   0   1
6   11275  311275    3      1       0   0   1   0   0   0   0
7   11275  411275    4      0       0   1   0   1   1   0   0
8   11275  411275    4      2       1   0   1   1   1   1   0
9   11275  411275    4      1       0   0   1   0   1   0   0
10  11275  511275    5      1       1   1   0   1   0   1   1
11  11275  511275    5      2       0   0   0   1   1   0   0
12  11275  511275    5      0       0   0   0   1   1   1   0
13  11275  611275    6      0       0   0   1   1   0   0   1
14  11275  611275    6      1       1   1   1   1   0   0   1
15  11275  611275    6      2       0   1   1   1   0   1   0
16  11275  711275    7      1       0   0   0   0   0   1   0
17  11275  711275    7      0       0   1   0   0   1   1   0
18  11275  711275    7      2       1   1   0   0   1   1   1
19  11275  811275    8      0       1   0   1   0   0   1   1
20  11275  811275    8      1       0   1   0   1   1   1   1
21  11275  811275    8      2       0   0   0   0   0   1   1
22  11275  911275    9      0       0   1   1   0   0   1   0
23  11275  911275    9      2       1   1   1   1   1   0   1
24  11275  911275    9      1       0   1   0   1   1   0   0
25  11275 1011275   10      0       0   0   0   0   0   0   0

userid and chid are factor variables; the rest are numeric. The variables:

userid: unique respondent ID
chid: unique choice set ID per respondent
item: choice set ID (repeated across respondents)
sector: the alternatives (3 different sectors)
outcome: whether the respondent selected that alternative in the given choice set
cul through rel: binary, alternative-specific variables that vary across alternatives according to the experimental design

Here is my mlogit syntax:

mlogit(outcome~cul+fit+ind+led+prj+rel,shape="long",
       data=data,id.var=userid,chid.var="chid",
       choice=outcome,alt.var="sector")

Here is the error I get:

Error in if (abs(x - oldx) < ftol) { : 
  missing value where TRUE/FALSE needed

I've made sure there are no missing data, and that each choice set has exactly 1 selected alternative. Any ideas about why I'm getting this error, when the model runs fine in Stata with the exact same dataset? I've probably misread the mlogit syntax somewhere. If it helps, my Stata syntax is:

asclogit outcome cul fit rel ind fit led prj, case(chid) alternatives(sector)


There are 2 answers

Daniel Morgan

You may need to use mlogit.data() to shape the data first. There are examples at ?mlogit. Hope that helps.
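
For illustration, here is a minimal sketch of what shaping with mlogit.data() might look like. It assumes the mlogit package is installed and uses only the first six rows and two of the attribute columns from the snapshot in the question; the object names dce and md are made up for the example. Newer releases of mlogit point to dfidx() instead of mlogit.data(), so treat the exact call as version-dependent.

```r
library(mlogit)

# Rows 1-6 of the snapshot in the question (two choice sets, three alternatives each).
# Only cul and fit are carried over to keep the example short.
dce <- data.frame(
  userid  = 11275,
  chid    = rep(c(211275, 311275), each = 3),
  sector  = c(1, 2, 0, 0, 2, 1),
  outcome = c(1, 0, 0, 1, 0, 0),
  cul     = c(0, 1, 0, 1, 0, 0),
  fit     = c(1, 0, 0, 1, 1, 1)
)

# Column names are passed as quoted strings.
md <- mlogit.data(dce, choice = "outcome", shape = "long",
                  alt.var = "sector", chid.var = "chid", id.var = "userid")

# With the full dataset shaped the same way, the model call would then be
# something like (not run here, two choice sets are far too few to estimate it):
# fit <- mlogit(outcome ~ cul + fit + ind + led + prj + rel, data = md)
```

Note that the column names go in as quoted strings, unlike id.var=userid and choice=outcome in the call from the question.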

robin.datadrivers

Answering my own question here as I figured it out.

R's mlogit can't handle a choice set in which none of the alternatives is selected. R also needs the data ordered properly: each alternative in a choice set must be in a row. I hadn't done that due to some data management. Interestingly, Stata can handle both of these conditions, which is why my Stata commands worked.
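
A rough sketch of how both conditions could be checked and fixed in base R before calling mlogit. It uses rows 1-6 of the snapshot from the question plus one made-up choice set (chid 999) in which nothing is selected; the names dce, n_chosen, bad_sets and clean are just for this example. Sorting by chid and then sector is one reading of "ordered properly", not something confirmed by the mlogit documentation.

```r
# Rows 1-6 of the snapshot, plus a hypothetical choice set (chid 999)
# where no alternative was selected.
dce <- data.frame(
  userid  = 11275,
  chid    = c(211275, 211275, 211275, 311275, 311275, 311275, 999, 999, 999),
  sector  = c(1, 2, 0, 0, 2, 1, 2, 0, 1),
  outcome = c(1, 0, 0, 1, 0, 0, 0, 0, 0)
)

# Count the selected alternatives in each choice set.
n_chosen <- tapply(dce$outcome, dce$chid, sum)

# Choice sets that do not have exactly one selected alternative.
bad_sets <- names(n_chosen)[n_chosen != 1]

# Drop those choice sets, then sort so that the rows of each choice set
# sit together with the alternatives in a consistent order.
clean <- dce[!(as.character(dce$chid) %in% bad_sets), ]
clean <- clean[order(clean$chid, clean$sector), ]
```

Here bad_sets picks up only the made-up choice set, and clean keeps the six original rows with sector running 0, 1, 2 inside each choice set.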

As an aside, for those interested, Stata's asclogit and R's mlogit give the exact same results. Always nice when that happens.