Different Robust Standard Errors of Logit Regression in Stata and R


I am trying to replicate a logit regression from Stata in R. In Stata I use the option "robust" to obtain robust (heteroscedasticity-consistent) standard errors. I can replicate exactly the same coefficients as Stata, but I am not able to get the same robust standard errors with the package "sandwich".

I have tried some OLS linear regression examples; the sandwich estimators in R and Stata give me the same robust standard errors for OLS. Does anybody know how Stata calculates the sandwich estimator for non-linear regression, in my case logit regression?

Thank you!

Code attached. In R:

library(sandwich)
library(lmtest)    
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")    
mydata$rank<-factor(mydata$rank)    
myfit<-glm(admit~gre+gpa+rank,data=mydata,family=binomial(link="logit"))    
summary(myfit)    
coeftest(myfit, vcov = sandwich)    
coeftest(myfit, vcov = vcovHC(myfit, "HC0"))    
coeftest(myfit, vcov = vcovHC(myfit))    
coeftest(myfit, vcov = vcovHC(myfit, "HC3"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC1"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC2"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC"))    
coeftest(myfit, vcov = vcovHC(myfit, "const"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC4"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC4m"))    
coeftest(myfit, vcov = vcovHC(myfit, "HC5"))    

Stata:

use http://www.ats.ucla.edu/stat/stata/dae/binary.dta, clear    
logit admit gre gpa i.rank, robust    

1 Answer

Achim Zeileis (accepted answer):

The default so-called "robust" standard errors in Stata correspond to what sandwich() from the package of the same name computes. The only difference is the finite-sample adjustment. By default, sandwich(...) applies no finite-sample adjustment at all, i.e., the sandwich is divided by n, where n is the number of observations. Alternatively, sandwich(..., adjust = TRUE) can be used, which divides by n - k, where k is the number of regressors. And Stata divides by n - 1.
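As a quick illustration of how close these three scalings are (a sketch in Python; the values n = 400 and k = 6 are taken from the UCLA admissions fit above):

```python
# Finite-sample adjustment factors relative to the unadjusted sandwich,
# for n = 400 observations and k = 6 estimated coefficients (the
# dimensions of the admissions logit above).
n, k = 400, 6

hc0_factor = 1.0              # sandwich() default: divide by n, no adjustment
adjust_factor = n / (n - k)   # sandwich(..., adjust = TRUE): divide by n - k
stata_factor = n / (n - 1)    # Stata's robust option: divide by n - 1

print(hc0_factor, adjust_factor, stata_factor)
```

All three factors approach 1 as n grows, which is why the choice is irrelevant asymptotically.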

Of course, asymptotically these do not differ at all. And except for a few special cases (e.g., OLS linear regression) there is no argument for division by n - k or n - 1 to work "correctly" in finite samples (e.g., to yield unbiasedness). At least not to the best of my knowledge.

So to obtain the same results as in Stata you can do:

# Rescale the default sandwich estimate by n/(n - 1), as Stata does
sandwich1 <- function(object, ...) sandwich(object) * nobs(object) / (nobs(object) - 1)
coeftest(myfit, vcov = sandwich1)

This yields

z test of coefficients:

              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -3.9899791  1.1380890 -3.5059 0.0004551 ***
gre          0.0022644  0.0011027  2.0536 0.0400192 *  
gpa          0.8040375  0.3451359  2.3296 0.0198259 *  
rank2       -0.6754429  0.3144686 -2.1479 0.0317228 *  
rank3       -1.3402039  0.3445257 -3.8900 0.0001002 ***
rank4       -1.5514637  0.4160544 -3.7290 0.0001922 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
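For intuition about what is being computed here, the logit sandwich can be built by hand. The following is a from-scratch sketch in Python with numpy, on simulated data rather than the admissions set, so the numbers are purely illustrative: it fits the logit by Newton-Raphson, assembles the unadjusted (HC0) sandwich from the bread (inverse information) and meat (summed outer products of the scores), and then applies the n/(n - 1) rescaling described above.

```python
# From-scratch sketch of the logit "sandwich" covariance (not the
# sandwich package itself). Data are simulated, so the numbers differ
# from the UCLA example; the point is the construction and the scaling.
import numpy as np

rng = np.random.default_rng(0)
n, k = 400, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([-0.5, 0.8, -0.3])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

# Fit by Newton-Raphson: maximum likelihood for the logit model.
beta = np.zeros(k)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ beta))
    H = X.T @ ((mu * (1 - mu))[:, None] * X)   # information matrix
    beta += np.linalg.solve(H, X.T @ (y - mu))

mu = 1 / (1 + np.exp(-X @ beta))
scores = (y - mu)[:, None] * X                 # per-observation scores
bread = np.linalg.inv(X.T @ ((mu * (1 - mu))[:, None] * X))
meat = scores.T @ scores                       # sum of score outer products

vcov_hc0 = bread @ meat @ bread                # unadjusted sandwich ("HC0")
vcov_stata = vcov_hc0 * n / (n - 1)            # Stata's robust rescaling

se_hc0 = np.sqrt(np.diag(vcov_hc0))
se_stata = np.sqrt(np.diag(vcov_stata))
print(se_hc0)
print(se_stata)
# Every Stata-style SE equals the HC0 SE times sqrt(n / (n - 1)).
```

The unscaled bread %*% meat %*% bread product here plays the role of vcovHC(myfit, "HC0") in R; multiplying it by n/(n - 1) reproduces the scaling Stata uses.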

And just for the record: in the binary response case, these "robust" standard errors are not robust against anything. Provided that the model is correctly specified, they are consistent and it is fine to use them, but they do not guard against any misspecification of the model.

The basic assumption for sandwich standard errors to work is that the model equation (or, more precisely, the corresponding score function) is correctly specified, while the rest of the model may be misspecified. In a binary regression, however, there is no room for such misspecification, because the model equation consists only of the mean (= success probability), and the likelihood is determined entirely by that mean and 1 - mean, respectively. This is in contrast to linear or count data regressions, where there may be heteroskedasticity, overdispersion, etc.