ValueError when fitting data with the statsmodels OLS method

I am trying to fit my data with the statsmodels OLS method. Following a tutorial, I imported all the necessary libraries:

from scipy import stats
import statsmodels.formula.api as sm
import numpy
import pandas
import matplotlib.pyplot as plt
import statsmodels.api as sm

Then I defined all the variable names from the X_train data:

variable_names = [
    'Block',
    'Acreage',
    'dist_Kyanuuna_TC',
    'dist_Busunju_TC',
    'dist_Namungo_TC',
    'dist_Kitalya_TC',
    'dist_Kabindula_TC',
    'dist_Namayumba_HC',
    'dist_BlueStarJr_Sch',
    'dist_Kyanuuna_HS',
    'dist_Busunju_Col',
    'Central_P',
    'years',
    'Use_Agric_Farm',
    'Use_Res',
    'Use_Res_Agric']

Then I added the neighborhood variable to the formula, so that binary dummy variables are created for it and the model is fitted without an intercept:

f = 'Value ~ ' + ' + '.join(variable_names) + ' + neighborhood - 1'

Finally, I fitted the model as below:

model2 = sm.OLS(f, data=X_train).fit()
print(model2.summary2())

However, this raises the following error:

ValueError: unrecognized data structures: <class 'str'> / <class 'NoneType'>

I have not been able to figure out what the issue is. Any clues on how to approach this would be very much appreciated. Thank you.

There is 1 answer

R. Marolahy On

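As written in the statsmodels documentation, a formula string has to be passed to the lowercase function ols, not to the class OLS. OLS expects the response and design arrays (endog, exog), which is why passing it a string raises this error.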

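Update: In your import section, the alias sm is bound to two different modules, so the second import (statsmodels.api) overwrites the first (statsmodels.formula.api). The lowercase ols function lives in statsmodels.formula.api, so either keep only that import or give the two modules distinct aliases (smf is the usual choice for statsmodels.formula.api) and call smf.ols(f, data=X_train).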