How to fit Multinomial logistic regression model with multilevel independent variables in stata or r

101 views Asked by At

I have travel data consisting of a person selecting a specific mode of transport (e.g., Car) over other options (e.g., Bus, Plane, and Train) and paying a certain price for the distance he travelled. Each row represents a trip that person made in their chosen mode. This data in wide format looks like this:

* Define variables and add data input str15 Mode Mode_id Price Distance Car 1 4.5 109 Train 2 2.1 34 Bus 3 3.3 14 Plain 4 8 150 Car 1 5 20 end * Display the dataset list

I have joined another data set of socio-economic variables at the district level, and the updated data looks like this:

* Define the dataset input str15 District District_id str10 Mode Mode_id Price Distance Income Praha 1 1 "Car" 1 4.5 109 200 Praha 1 1 "Train" 2 2.1 34 200 Praha 2 2 "Bus" 3 3.3 14 300 Praha 1 1 "Plain" 4 8 150 200 Praha 2 2 "Car" 1 5 20 300 end * Save the dataset save "Trips_Districts.dta", replace

Now I want to predict people's mode of choice preference with alternative specific independent variables "price" and "distance" and with another set of independent variables "income" and "education" that are nested at the first level, which is district.

The reshaped data in long format looks like this:

* Define variable names and data input str10 District str10 District_id Person_id str10 Mode Choice Price Distance Income "Praha 1" 1 1 "Car" 1 4.5 109 200 "Praha 1" 1 1 "Train" 0 . 200 "Praha 1" 1 1 "Bus" 0 . 200 "Praha 1" 1 1 "Plane" 0 . 200 "Praha 1" 1 2 "Train" 1 2.1 34 200 "Praha 1" 1 2 "Car" 0 . 200 "Praha 1" 1 2 "Bus" 0 . 200 "Praha 1" 1 2 "Plane" 0 . 200 "Praha 2" 2 3 "Bus" 1 3.3 14 300 "Praha 2" 2 3 "Plane" 0 . 300 "Praha 2" 2 3 "Car" 0 . 300 "Praha 2" 2 3 "Train" 0 . 300 "Praha1" 1 4 "Plane" 1 8 150 200 "Praha1" 1 4 "Car" 0 . 200 "Praha1" 1 4 "Train" 0 . 200 "Praha1" 1 4 "Bus" 0 . 200 "Praha 2" 1 5 "Car" 1 20 21 300 "Praha 2" 1 5 "Train" 0 . 300 "Praha 2" 1 5 "Bus" 0 . 300 "Praha 2" 1 5 "Plane" 0 . 300 end * Save the dataset save "Trips_Districts_Final.dta", replace

Problem1 : I don't have price and distance information for modes other than the one chosen.

Any model recommendation and data code to reshape my wide data into a long format in stata or r will be highly appreciated.

​​​​​​​Thank you in advance.

1

There are 1 answers

0
Pablo Bernabeu On

Regarding reshaping to long format, you recently provided the solution here.

Regarding the analysis, I cannot see enough data here to create any statistical model. I think you could only create some plots (for instance, using ggplot2).