Quantile regression in SAS vs Python


Given this dataset

import pandas as pd

df = pd.DataFrame({'year': [2000,2000,2000,2000,2000,2001,2001,2001,2001,2001,2002,2002,2002,2002,2002], 'metric': [2,3,4,5,6,12,13,14,15,16,22,23,24,25,26]})

running quantile regression for the 0.5 quantile using the Statsmodels package

import statsmodels.formula.api as smf

# Median (q=0.5) regression of metric on year
model = smf.quantreg('metric ~ year', df)
result = model.fit(q=0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06)

results in this outcome:

[image: Statsmodels regression summary]

SAS's PROC QUANTREG, however, produces this:

[image: PROC QUANTREG output]

I'm confused by the Statsmodels coefficient. Given that the medians for the 3 years are 4, 14, and 24, shouldn't the coefficient be 10, as in SAS? Changing the kernel and bandwidth options doesn't affect it.
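To make the reasoning about the medians concrete, here is a minimal sketch that uses only the standard library, not Statsmodels or SAS. It computes the per-year medians and evaluates the quantile (check) loss for a few candidate lines. The helper name `check_loss` and the choice to measure year as an offset from 2000 are my own assumptions for illustration. It only shows which slope minimizes the loss on this data; it does not explain what the Statsmodels solver does internally.

```python
from statistics import median

years = [2000] * 5 + [2001] * 5 + [2002] * 5
metric = [2, 3, 4, 5, 6, 12, 13, 14, 15, 16, 22, 23, 24, 25, 26]

# Median of metric within each year.
medians = {
    y: median(m for yy, m in zip(years, metric) if yy == y)
    for y in sorted(set(years))
}

def check_loss(intercept, slope, q=0.5):
    """Quantile (pinball) loss of the line metric = intercept + slope * (year - 2000).

    Hypothetical helper for illustration; year is offset by 2000 for readability.
    """
    total = 0.0
    for y, m in zip(years, metric):
        r = m - (intercept + slope * (y - 2000))
        # Positive residuals are weighted by q, negative ones by (1 - q).
        total += r * (q - (r < 0))
    return total

print(medians)
for slope in (9, 10, 11):
    print(slope, check_loss(4, slope))
```

The line through the three medians (intercept 4 at year 2000, slope 10) minimizes the absolute deviations within every year at once, so it should be the median-regression solution on the original year scale.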

I do see the "condition number is large" message for Statsmodels. If I address this by normalizing the dataset so that the years become 0, 0.5, and 1, then I do get a coefficient of 20, which is what SAS produces for the normalized data as well.
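As a rough check on the conditioning and on the 10-versus-20 arithmetic, the sketch below computes a condition number for the design matrix with columns [1, year], before and after normalization. It assumes the usual 2-norm definition (square root of the ratio of the largest to smallest eigenvalue of X'X), which may not be exactly what Statsmodels reports, and `design_condition_number` is a hypothetical helper written for this example.

```python
import math

def design_condition_number(xs):
    """2-norm condition number of the design matrix with columns [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # X'X = [[n, sx], [sx, sxx]]; its eigenvalues follow from trace and determinant.
    trace = n + sxx
    det = n * sxx - sx * sx
    lam_max = (trace + math.sqrt(trace * trace - 4 * det)) / 2
    lam_min = det / lam_max  # avoids cancellation in the subtraction form
    return math.sqrt(lam_max / lam_min)

years = [2000] * 5 + [2001] * 5 + [2002] * 5
span = max(years) - min(years)
normalized = [(y - min(years)) / span for y in years]  # 0, 0.5, 1

cond_raw = design_condition_number(years)
cond_norm = design_condition_number(normalized)

# One unit of the normalized variable covers `span` years, so a slope of
# 10 per year (the value expected from the medians) corresponds to 10 * span.
slope_per_year = 10
slope_normalized = slope_per_year * span

print(cond_raw, cond_norm, slope_normalized)
```

With raw years the condition number is in the millions, while the normalized design is close to 3. And since one normalized unit spans two years, a coefficient of 20 on the normalized scale is the same fit as 10 per year.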

Why does Statsmodels produce a different coefficient for non-normalized data?
