I have made an extremely simple logistic model to illustrate my question. Here is the code:
import patsy
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame()
for i in range(5):
    df.at[i, 'response'] = 1
    if i == 3:
        df.at[i, 'response'] = 0
df['x'] = range(5)

y, X = patsy.dmatrices('response ~ x', df, return_type='dataframe')
logit_model = sm.Logit(y, X)
result = logit_model.fit()
ypred = logit_model.predict(X)
print(ypred)
Please excuse the rough code, I'm writing this in a rush before heading to work. It raises a ValueError at about line 18: ValueError: shapes (5,2) and (5,2) not aligned: 2 (dim 1) != 5 (dim 0).
I genuinely don't understand how these are not aligned, as I am simply passing the training data X back into the model using predict(). My feeling is that I am missing something about patsy.dmatrices.
Anyone have an idea?
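You assigned the fitted object to result, so you should use that to predict: call result.predict(X) instead of logit_model.predict(X). The unfitted model's predict() expects the parameter vector as its first argument, so X is treated as the parameters and multiplied against the (5,2) design matrix, which is why the shapes are not aligned. This has nothing to do with patsy.dmatrices.
To get the fitted values for the training data, you can also call result.predict() with no arguments.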