Statsmodels VARMAX: confidence / prediction intervals with more than one endogenous variable


I am trying to recover confidence/prediction intervals in Python statsmodels (version 0.12.1) with two or more endogenous (y) variables, as is common with VARMAX. The following example correctly predicts the in-sample and out-of-sample means for two endogenous variables. However, the in-sample and out-of-sample confidence intervals are returned only for the first endogenous variable, dln_inv. How can I recover the confidence intervals for the second variable, dln_inc, as well? Any help would be appreciated.

import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.api import VARMAX
import warnings
warnings.filterwarnings("ignore")

dta = sm.datasets.webuse('lutkepohl2', 'https://www.stata-press.com/data/r12/')
dta.index = dta.qtr
dta.index.freq = dta.index.inferred_freq
subset = dta.loc['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]
endog = subset[['dln_inv', 'dln_inc']]  # notice two endogenous variables
exog = subset['dln_consump']

p, q = 0, 1  # VARMA order: no autoregressive lags, one moving-average lag

model = VARMAX(endog, exog=exog, order=(p, q)).fit(maxiter=100, disp=False)

in_sample_predicted = model.get_prediction()
in_sample_predicted_means = in_sample_predicted.predicted_mean
# the following command seems to produce the confidence interval for the first endogenous variable, dln_inv
in_sample_CI = in_sample_predicted.summary_frame(alpha=0.05) 

n_periods = 5
exog_preforecast = exog + exog * np.random.normal(0,0.5,exog.shape)
out_sample_forecast = model.get_forecast(steps=n_periods,exog=exog_preforecast[-n_periods:])
out_sample_forecast_means = out_sample_forecast.predicted_mean
# the following command seems to produce the confidence interval for the first endogenous variable, dln_inv
out_sample_CI = out_sample_forecast.summary_frame(alpha=0.05) 

1 answer

cfulton

There are two ways to get confidence intervals for all variables.

First, if you are using the summary_frame method, you can pass the integer index of the variable whose intervals you want to retrieve via the endog argument (which unfortunately does not appear to be in the docstring).

summary_dln_inv = out_sample_forecast.summary_frame(endog=0, alpha=0.05) 
summary_dln_inc = out_sample_forecast.summary_frame(endog=1, alpha=0.05) 
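
If there are more than two endogenous variables, the same call can be repeated in a loop. The following is only a minimal sketch: the helper name summary_frames_by_variable is hypothetical (it is not part of statsmodels), and it assumes the names are passed in the same order as the columns of endog.

```python
def summary_frames_by_variable(results, endog_names, alpha=0.05):
    """Collect one summary frame per endogenous variable.

    `results` is assumed to be the object returned by get_prediction()
    or get_forecast(); `endog_names` must follow the column order of endog.
    This helper is illustrative and not part of statsmodels.
    """
    frames = {}
    for position, name in enumerate(endog_names):
        # summary_frame selects the variable by integer position via `endog`.
        frames[name] = results.summary_frame(endog=position, alpha=alpha)
    return frames


# Usage with the objects from the question:
# frames = summary_frames_by_variable(out_sample_forecast, endog.columns)
# frames['dln_inc']
```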

Second, you can retrieve intervals for all variables at once using the conf_int method:

all_CI = out_sample_forecast.conf_int(alpha=0.05)

This yields the following DataFrame:

            lower dln_inv  lower dln_inc  upper dln_inv  upper dln_inc
1979-01-01      -0.067805       0.011456       0.101923       0.050345
1979-04-01      -0.081301      -0.007333       0.095298       0.034796
1979-07-01      -0.080236      -0.006666       0.096362       0.035463
1979-10-01      -0.087785      -0.011397       0.088813       0.030732
1980-01-01      -0.085402      -0.009903       0.091197       0.032226
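
Because conf_int puts the bounds for every variable in a single wide frame, it can be convenient to split it into one lower/upper frame per variable. The sketch below rebuilds the table above by hand so that it runs without fitting a model; the helper name split_intervals is hypothetical, and it assumes the "lower <name>" / "upper <name>" column naming shown in the output.

```python
import pandas as pd

# Values copied from the conf_int output shown above.
all_CI = pd.DataFrame(
    {
        "lower dln_inv": [-0.067805, -0.081301, -0.080236, -0.087785, -0.085402],
        "lower dln_inc": [0.011456, -0.007333, -0.006666, -0.011397, -0.009903],
        "upper dln_inv": [0.101923, 0.095298, 0.096362, 0.088813, 0.091197],
        "upper dln_inc": [0.050345, 0.034796, 0.035463, 0.030732, 0.032226],
    },
    index=pd.to_datetime(
        ["1979-01-01", "1979-04-01", "1979-07-01", "1979-10-01", "1980-01-01"]
    ),
)


def split_intervals(conf_int, endog_names):
    """Return {variable name: DataFrame with 'lower' and 'upper' columns}.

    Illustrative helper, not part of statsmodels; it relies on the
    'lower <name>' / 'upper <name>' column labels seen above.
    """
    per_variable = {}
    for name in endog_names:
        bounds = conf_int[[f"lower {name}", f"upper {name}"]].copy()
        bounds.columns = ["lower", "upper"]
        per_variable[name] = bounds
    return per_variable


intervals = split_intervals(all_CI, ["dln_inv", "dln_inc"])
print(intervals["dln_inc"])
```

With the fitted model from the question, the same helper could be called as split_intervals(out_sample_forecast.conf_int(alpha=0.05), endog.columns).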