I am trying to fit my data with the statsmodels OLS method. Following a tutorial, I imported all the necessary libraries:
from scipy import stats
import statsmodels.formula.api as sm
import numpy
import pandas
import matplotlib.pyplot as plt
import statsmodels.api as sm
Then I defined all the variable names from the X_train data:
variable_names = [
'Block',
'Acreage',
'dist_Kyanuuna_TC',
'dist_Busunju_TC',
'dist_Namungo_TC',
'dist_Kitalya_TC',
'dist_Kabindula_TC',
'dist_Namayumba_HC',
'dist_BlueStarJr_Sch',
'dist_Kyanuuna_HS',
'dist_Busunju_Col',
'Central_P',
'years',
'Use_Agric_Farm',
'Use_Res',
'Use_Res_Agric']
Then I added the neighborhood variable to the formula, so that binary dummy variables are created for it and the model is fitted without an intercept:
f = 'Value ~ ' + ' + '.join(variable_names) + ' + neighborhood - 1'
Finally, I fitted the data as below:
model2 = sm.OLS(f, data=X_train).fit()
print(model2.summary2())
However, this raises the following error:
ValueError: unrecognized data structures: <class 'str'> / <class 'NoneType'>
I have not been able to figure out what the issue is. Any clues on how to approach this would be very much appreciated. Thank you.
As written in the documentation, the function that accepts a formula string is ols (lowercase) instead of OLS.
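For illustration, here is a minimal sketch of the lowercase ols call with a formula string. The toy DataFrame, its values, and the shortened formula are made up for this example and only stand in for X_train; smf is an alias chosen here, not one used in the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf  # formula API, which provides lowercase ols

# Hypothetical toy data standing in for X_train (values are invented)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Acreage": rng.uniform(1, 10, size=60),
    "years": rng.integers(1, 20, size=60),
    "neighborhood": rng.choice(["A", "B", "C"], size=60),
})
df["Value"] = 3.0 * df["Acreage"] + 0.5 * df["years"] + rng.normal(size=60)

# The string column neighborhood is dummy-coded by the formula machinery;
# "- 1" removes the intercept, as in the question's formula
f = "Value ~ Acreage + years + neighborhood - 1"
model = smf.ols(f, data=df).fit()
print(model.summary2())
```

With the intercept removed, each neighborhood level should get its own dummy column, which is the behaviour the question describes wanting.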
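Update: In your import section, you use the alias sm for two different modules, so the second import (statsmodels.api) replaces the first (statsmodels.formula.api). Since ols belongs to statsmodels.formula.api, removing the second import, or giving the two modules different aliases, should work.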