Folks,
I'm trying to fit a model in R using lme() from the {nlme} package with many fixed effects and additional random effects (gotta be {nlme} since I also want to include an AR(1) correlation matrix later on). However, the model is terribly slow (due to the huge number of fixed effects).
Now, we know that running a model with dummies is numerically, but not computationally, equivalent to running a model on the de-meanded data (within estimation). Hence, the following two models lead to equivalent results, but the first one runs much faster:
# Load library
library(nlme) # For lme()
library(lfe) # For demeanlist()
# Create reproducible dummy dataset
set.seed(1)
dat <- data.frame(y=rnorm(1000), x=rnorm(1000), cluster1=sample(LETTERS, 1000, replace=TRUE), cluster2=sample(LETTERS, 1000, replace=TRUE))
# Create de-meaned dataset
dat_demean <- demeanlist(dat[,c("y", "x")], list(dat$cluster1))
dat_demean$cluster2 <- dat$cluster2 # Copy cluster2 column over to de-meaned data
# Model 1: Fixed effect on cluster1
orig <- lm( y ~ x + cluster1, data=dat)
deme <- lm( y ~ x - 1, data=dat_demean)
# This works as expected. Coefficients are exactly identical (SE are wrong and need to be fixed).
all.equal( coef(orig)["x"], coef(deme)["x"] )
R> [1] TRUE
Can something similar be done when we want to additionally include random effects on the second grouping variable? Below is my attempt which does not work.
# Model 2: Fixed effect on cluster1 and additional random effect for cluster2
orig <- lme( y ~ x + cluster1, random=~1|cluster2, data=dat)
deme <- lme( y ~ x -1, random=~1|cluster2, data=dat_demean)
# This does not work - coefficients are different
fixef(orig)["x"]
R> x
R> -0.005885881
fixef(deme)["x"]
R> x
R> -0.006169549
all.equal(fixef(orig)["x"], fixef(deme)["x"])
R> [1] "Mean relative difference: 0.04819478"
What am I missing or what am I doing wrong? Or is it just not that simply possible to fit random effects models on de-meaned data?
Cheers
I know this is an old question now, but I just caught it. I believe your estimates differ to that extent in the second example because you have not modeled the between cluster variance. In general, this approach will provide similar estimates if the between cluster variance is not too large, it also adds some parametric assumptions about the random effects in addition to linear fe approach.
The example below uses
parameters::demean
for convenience (it has a helpful naming convention).