Clustered standard errors with imputed and weighted data in R

228 views Asked by At

I am attempting to get clustered SEs (at the school level in my data) with data that is both imputed (MICE) and weighted (CBPS). I have tried a couple different approaches that have thrown different errors.

This is what I have to start, which works fine:

library(tidyverse)
library(mice)
library(MatchThem)
library(CBPS)

tempdata <- mice(d, m = 10, maxit = 50, meth = "pmm", seed = 99)

weighted_data <- weightthem(trtmnt ~ x1 + x2 + x3,
                            data = tempdata,
                            method = "cbps",
                            estimand = "ATT")

Using this (https://www.r-bloggers.com/2021/05/clustered-standard-errors-with-r/) as a guide, I attempted all 3, which all resulted in various types of error messages.

My data is in a restricted server so unfortunately I can't bring it into here to reproduce things exactly, although if it's useful I could attempt to recreate some sample data.

So attempting with estimatr first, I get this error:

m1 <- estimatr::lm_robust(outcome ~ trtmnt + x1 + x2 + x3,
                          clusters = schoolID,
                          data = weighted_data)

Error in eval_tidy(mfargs[[da]], data = data) :
  object 'schoolID' not found

I have no clue where the schoolID variable would have dropped out/not be recognized. It isn't part of the weighting procedure but it should still be in the data frame...if I use it as a covariate in a standard model without clustering, it's there.

I also attempted with miceadds and got this error:

m2 <- miceadds::lm.cluster(outcome ~ trtmnt + x1 + x2 + x3,
                           cluster = "schoolID",
                           data = weighted_data)

Error in as.data.frame.default(data) : 
  cannot coerce class `"wimids"` to a data.frame

And finally, with sandwich and lmtest:

library(sandwich)
library(lmtest)

m3 <- weighted_models <- with(weighted_data,
                              exp=lm(outcome ~ trtmnt + x1 + x2 + x3))

msandwich <- coeftest(m3, vcov = vcovCL, cluster = ~schoolID)

Error in UseMethod("estfun") :
   no applicable method for `estfun` applied to an object of class "c(`mimira`, `mira`)"

Any ideas on any of the above methods, or where to go next?

1

There are 1 answers

0
Noah On

You were really close. You need to use with(weighted_data, .) to fit a model in your weighted datasets, and you need to use estimatr::lm_robust() to get the clustered standard errors. So try the following:

weighted_models <- with(weighted_data,
                        estimatr::lm_robust(outcome ~ trtmnt + x1 + x2 + x3,
                                            cluster = schoolID))

Your first and second approaches were incorrect because you supplied weighted_data to a single model as if it were a data frame, but it's not; it's a complicated wimids object. You need to use the with() infrastructure to fit a model to the imputed weighted data.

Your third approach was close, but coeftest() needs to be used on a single model, not a mimira object, which contains all the models fit the imputed datasets. Although you can use coeftest() inside with() with mira objects, you cannot do so with mimira objects from MatchThem. This is where estimatr::lm_robust() comes in since it is able to apply the clustering within each imputed dataset.

I also recommend you take a look at this blog post on estimating treatment effects after weighting with multiply imputed data. The only difference in your case to the code presented in the post is that you would change vcov = "HC3" to vcov = ~schoolID in whichever function you use.