Python: How to evaluate the residuals in StatsModels?


I want to compute the residuals, i.e. the observed values minus the fitted values (y - y_hat).

I know how to do that by hand:

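import pandas as pd
import patsy as ps
import statsmodels.api as sm

# read a whitespace-delimited file without a header row
df = pd.read_csv('myFile', sep=r'\s+', header=None)
df.columns = ['column1', 'column2']
y, X = ps.dmatrices('column1 ~ column2', data=df, return_type='dataframe')
model = sm.OLS(y, X)
results = model.fit()
predictedValues = results.predict()
# print(predictedValues)
# 1-D array, so the subtraction below is element-wise (as_matrix was removed from pandas)
yData = df['column1'].to_numpy()
res = yData - predictedValues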

Is there a built-in method or attribute that does this?

There are 3 answers

TomAugspurger (best answer)

That's stored in the resid attribute of the Results class.

Likewise, there's a results.fittedvalues attribute, so you don't need results.predict().

SciPy

Normality of the residuals

Option 1: Jarque-Bera test

import statsmodels.stats.api as sms
from statsmodels.compat import lzip

name = ['Jarque-Bera', 'Chi^2 two-tail prob.', 'Skew', 'Kurtosis']
test = sms.jarque_bera(results.resid)
lzip(name, test)

Out:

[('Jarque-Bera', 3.3936080248431666),
 ('Chi^2 two-tail prob.', 0.1832683123166337),
 ('Skew', -0.48658034311223375),
 ('Kurtosis', 3.003417757881633)]

Option 2: Omnibus test

name = ['Chi^2', 'Two-tail probability']
test = sms.omni_normtest(results.resid)
lzip(name, test)

Out:

[('Chi^2', 3.713437811597181),
 ('Two-tail probability', 0.15618424580304824)]

yanniskatsaros

If you are looking for a variety of (scaled) residuals such as externally/internally studentized residuals, PRESS residuals and others, take a look at the OLSInfluence class within statsmodels.

Using the results (a RegressionResults object) from your fit, you instantiate an OLSInfluence object that will have all of these properties computed for you. Here's a short example:

import statsmodels.api as sm
from statsmodels.stats.outliers_influence import OLSInfluence

# load_pandas() is used here because newer statsmodels releases no longer accept as_pandas
data = sm.datasets.spector.load_pandas()
X = data.exog
y = data.endog

# fit the model
model = sm.OLS(y, sm.add_constant(X, prepend=False))
fit = model.fit()

# compute the residuals and other metrics
influence = OLSInfluence(fit)