Given this dataset:
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    'year': [2000, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002],
    'metric': [2, 3, 4, 5, 6, 12, 13, 14, 15, 16, 22, 23, 24, 25, 26],
})
running a quantile regression for the 0.5 quantile using the statsmodels package:
model = smf.quantreg('metric ~ year', df)
result = model.fit(q=0.5, vcov='robust', kernel='epa', bandwidth='hsheather', max_iter=1000, p_tol=1e-06)
results in this outcome:
SAS' PROC QUANTREG, however, produces
I'm confused by the statsmodels coefficient. Given that the medians for the three years are 4, 14, and 24, shouldn't the coefficient be 10, as in the SAS output? Changing the kernel and bandwidth options doesn't affect it.
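To make the expectation concrete, here is a minimal sketch in plain Python (no statsmodels) that computes the per-year medians and evaluates the sum of absolute residuals for the line through them. At q=0.5 the quantile-regression check loss is half the absolute loss, so the two have the same minimizer. The helper name `abs_loss` and the comparison line with slope 9 are illustrative choices, not part of either package; comparing two candidate lines illustrates the expectation but does not prove optimality.

```python
from statistics import median

years = [2000] * 5 + [2001] * 5 + [2002] * 5
metric = [2, 3, 4, 5, 6, 12, 13, 14, 15, 16, 22, 23, 24, 25, 26]

# Median of metric within each year
medians = {
    y: median(m for yy, m in zip(years, metric) if yy == y)
    for y in sorted(set(years))
}

def abs_loss(intercept, slope):
    """Sum of absolute residuals for the line intercept + slope * year (hypothetical helper)."""
    return sum(abs(m - (intercept + slope * y)) for y, m in zip(years, metric))

# Line through the three medians: slope 10 per year
slope = 10
intercept = medians[2000] - slope * 2000
loss_through_medians = abs_loss(intercept, slope)

# An arbitrary alternative line with slope 9 through the year-2000 median, for comparison
loss_alternative = abs_loss(medians[2000] - 9 * 2000, 9)

print(medians)
print(loss_through_medians, loss_alternative)
```

The line through the medians gives the smaller loss of the two, which is why a slope of 10 is the value I would expect a median regression to report on the raw years.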
I do see the "condition number is large" warning from statsmodels. If I address this by normalizing the data so that the years become 0, 0.5, and 1, I get a coefficient of 20, which is what SAS produces on the normalized data as well.
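The 20 on the normalized scale is consistent with 10 on the original scale, since one normalized unit spans two years. A small sketch of that arithmetic, using only the three medians rather than a fitted model (the variable names are illustrative):

```python
years = [2000, 2001, 2002]
medians = [4, 14, 24]

# Min-max normalization of year to the range [0, 1]
lo, hi = min(years), max(years)
scaled = [(y - lo) / (hi - lo) for y in years]

# Slope of the line through the medians on each scale
slope_per_year = (medians[-1] - medians[0]) / (years[-1] - years[0])
slope_scaled = (medians[-1] - medians[0]) / (scaled[-1] - scaled[0])

# Converting the normalized slope back to "per year" units
slope_back = slope_scaled / (hi - lo)

print(scaled, slope_per_year, slope_scaled, slope_back)
```

So a coefficient of 20 on the normalized years and 10 on the raw years describe the same fitted line.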
Why does statsmodels produce a different coefficient for the non-normalized data?