I am comparing SARIMAX fitting results between the forecast package (7.3) in R (3.3.1) and statsmodels (0.8) in Python (3.5.2).
The R code is:
library(forecast)
data("AirPassengers")
Arima(AirPassengers, order=c(2,1,1),
      seasonal=list(order=c(0,1,0), period=12))$aic
[1] 1017.848
The Python code is:
from statsmodels.tsa.statespace import sarimax
import pandas as pd
# Monthly airline passenger totals, 1949-1960 (same values as R's AirPassengers)
AirlinePassengers = pd.Series([112,118,132,129,121,135,148,148,136,119,104,118,115,126,
141,135,125,149,170,170,158,133,114,140,145,150,178,163,
172,178,199,199,184,162,146,166,171,180,193,181,183,218,
230,242,209,191,172,194,196,196,236,235,229,243,264,272,
237,211,180,201,204,188,235,227,234,264,302,293,259,229,
203,229,242,233,267,269,270,315,364,347,312,274,237,278,
284,277,317,313,318,374,413,405,355,306,271,306,315,301,
356,348,355,422,465,467,404,347,305,336,340,318,362,348,
363,435,491,505,404,359,310,337,360,342,406,396,420,472,
548,559,463,407,362,405,417,391,419,461,472,535,622,606,
508,461,390,432])
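# Month-end index ending December 1960; newer pandas versions no longer accept
# end=/periods= in the DatetimeIndex constructor, so use date_range instead
AirlinePassengers.index = pd.date_range(end='1960-12-31',
periods=len(AirlinePassengers), freq='M')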
print(sarimax.SARIMAX(AirlinePassengers,order=(2,1,1),
seasonal_order=(0,1,0,12)).fit().aic)
This raises an error: ValueError: Non-stationary starting autoregressive parameters found with enforce_stationarity set to True.
If I set enforce_stationarity to False (and enforce_invertibility as well, which also turns out to be required), the fit succeeds, but the AIC is much worse (>1400).
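For reference, the stationarity condition that the error message refers to can be sketched for the AR(2) part of this model. This is a minimal illustration using only the standard library, not statsmodels' actual implementation. The function name ar2_is_stationary is hypothetical, and it assumes the usual textbook condition that all roots of 1 - phi1*z - phi2*z^2 lie outside the unit circle.

```python
import cmath


def ar2_is_stationary(phi1, phi2):
    """Check the textbook stationarity condition for an AR(2) process.

    The process y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t is stationary when
    every root of 1 - phi1*z - phi2*z**2 lies strictly outside the unit circle.
    """
    if phi2 == 0:
        # Degenerates to AR(1) (or white noise if phi1 is also 0)
        return abs(phi1) < 1
    # Solve phi2*z**2 + phi1*z - 1 = 0 with the quadratic formula
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    roots = [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]
    return all(abs(r) > 1 for r in roots)


# Illustrative coefficient pairs (not taken from either package's output)
inside = ar2_is_stationary(0.5, 0.3)   # phi1 + phi2 < 1
outside = ar2_is_stationary(0.6, 0.5)  # phi1 + phi2 > 1
```

If the starting values that the optimizer begins from fail a check of this kind, that would presumably explain why the error appears before any fitting takes place.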
With some other model orders on the same data, e.g. ARIMA(0,1,1)(0,0,1)[12], I get identical results from R and Python with the stationarity and invertibility checks enabled in Python.
My main question is: What explains the difference in behavior for some model orders? Are statsmodels' invertibility checks different from those in forecast's Arima, and is one of them somehow "more correct"?
I also found a pull request related to fixing an invertibility calculation bug in statsmodels: https://github.com/statsmodels/statsmodels/pull/3506
After re-installing statsmodels from the latest source code on GitHub, I still get the same error with the code above. With enforce_stationarity=False and enforce_invertibility=False, however, I now get an AIC of around 1010, which is lower than in the R case, but the estimated parameters are also vastly different.
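To make the AIC comparison concrete, the log-likelihood implied by R's AIC can be backed out as in the sketch below. It assumes the standard definition AIC = 2k - 2*logL and assumes that four parameters are counted for this model (two AR coefficients, one MA coefficient, and the innovation variance). The helper names are hypothetical, and the parameter count is an assumption rather than something confirmed from either package's documentation.

```python
def aic(loglike, n_params):
    """Standard AIC definition: 2k - 2*logL."""
    return 2 * n_params - 2 * loglike


def implied_loglike(aic_value, n_params):
    """Invert the AIC definition to recover the log-likelihood."""
    return (2 * n_params - aic_value) / 2


# R reported AIC = 1017.848; k = 4 is an assumed parameter count
r_loglike = implied_loglike(1017.848, 4)
print(round(r_loglike, 3))
```

Comparing log-likelihoods directly removes the parameter-count term, but the values are only comparable if both packages evaluate the likelihood over the same set of observations, which may not be the case when differencing is handled differently.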