I would like to employ a spatial regression model, using the spreg package in Python. My data consists of numeric variables, but I also have a categorical land cover variable (with 7 classes) that I need to include in the model. This works perfectly fine using statsmodels, but I haven't been able to figure out how to do this in spreg.
I have tried creating dummy variables manually (using pd.get_dummies(data['land_cover'])), but this produces the following warnings from my spreg.OLS model:
RuntimeWarning: invalid value encountered in sqrt se_result =np.sqrt(variance)
RuntimeWarning: invalid value encountered in sqrt tStat = betas[list(range(0, len(vm)))].reshape(len(vm),) / np.sqrt(variance)
The constant and all the dummy variables also have nan values in the Std.Error, t-Statistic and Probability columns of the results (see excerpt below).
Variable        Coefficient     Std.Error    t-Statistic    Probability
CONSTANT        -142.9375000    nan          nan            nan
temperature     0.0136240       0.0001169    116.4984154    0.0000000
precipitation   0.0000003       0.0000000    153.7448775    0.0000000
cover_1         141.9375000     nan          nan            nan
cover_2         142.0625000     nan          nan            nan
cover_3         141.6875000     nan          nan            nan
cover_4         142.0625000     nan          nan            nan
cover_5         141.9375000     nan          nan            nan
cover_6         141.6875000     nan          nan            nan
cover_7         141.8125000     nan          nan            nan
Using statsmodels with the same data/variables, the output of the OLS model was this:
                coef         std err     t          P>|t|
temperature     -0.0004      2.72e-05    -15.115    0.000
precipitation   -1.62e-08    4.12e-10    -39.294    0.000
cover_1         0.0706       0.001       119.653    0.000
cover_2         0.0290       0.001       29.431     0.000
cover_3         0.0100       0.001       7.120      0.000
cover_4         0.0491       0.000       122.972    0.000
cover_5         0.0327       0.000       79.698     0.000
cover_6         0.0140       0.000       35.541     0.000
cover_7         -0.0026      0.001       -4.223     0.000
How can I include my categorical data in the spreg models (e.g. spreg.GM_Lag)?
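My guess is that you ran into the "dummy variable trap".
Your statsmodels model has no constant, whereas the spreg model includes one (the CONSTANT row in your output).
If you don't drop a reference level from your categorical variable, the seven dummy columns sum to one in every row and are perfectly collinear with the constant. The design matrix is then rank deficient and the cross-product matrix X'X is not invertible, which is most likely why the standard errors come out as nan. Create the dummies with one level dropped, e.g. pd.get_dummies(data['land_cover'], drop_first=True); the remaining dummy coefficients are then differences from the dropped class.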
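To make the collinearity concrete, here is a minimal sketch on synthetic data. The column names follow the question, but the values are made up, and y and w in the commented spreg calls are placeholders for your own dependent variable and spatial weights. It checks the rank of the design matrix with and without a dropped reference level, then builds the x array you would pass to spreg.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (values are invented for illustration).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "temperature": rng.normal(15, 5, n),
    "precipitation": rng.normal(800, 200, n),
    # 7 land cover classes, each guaranteed to appear
    "land_cover": rng.permutation(np.arange(n) % 7 + 1),
})

numeric = data[["temperature", "precipitation"]]
const = np.ones((n, 1))

# All 7 dummies: together they sum to the constant column.
full = pd.get_dummies(data["land_cover"], prefix="cover", dtype=float)
X_trap = np.hstack([const, numeric.to_numpy(), full.to_numpy()])
rank_trap = np.linalg.matrix_rank(X_trap)

# Drop the first class (cover_1) so it becomes the reference level.
reduced = pd.get_dummies(
    data["land_cover"], prefix="cover", drop_first=True, dtype=float
)
X_ok = np.hstack([const, numeric.to_numpy(), reduced.to_numpy()])
rank_ok = np.linalg.matrix_rank(X_ok)

print(f"all dummies:   {X_trap.shape[1]} columns, rank {rank_trap}")
print(f"one dropped:   {X_ok.shape[1]} columns, rank {rank_ok}")

# Regressors for spreg: no constant column here, since spreg adds CONSTANT.
x = np.hstack([numeric.to_numpy(), reduced.to_numpy()])
name_x = list(numeric.columns) + list(reduced.columns)

# Assumed usage; y is an (n, 1) array and w a libpysal weights object,
# neither of which is defined in this sketch:
# import spreg
# ols = spreg.OLS(y, x, name_x=name_x)
# lag = spreg.GM_Lag(y, x, w=w, name_x=name_x)
```

With all seven dummies the matrix has one more column than its rank; after dropping one level it has full column rank. Which class you drop only changes the baseline the other coefficients are compared against, not the fit.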