I have a question regarding the extraction of the (raw) model matrix of random effects from models fitted with lmer (lme4) in R.
More specifically, I want to obtain a data frame or a matrix that contains all variables that are involved in random effects terms.
The matter is further complicated because some entries of that matrix are zero.
I usually extract these matrices by accessing the sparse model matrix (Zt) via getME and then reshaping its stored values into a regular matrix with the known dimensions (see below).
However, this leads to problems whenever the (raw) model matrix contains zeros, because Zt only stores the nonzero elements.
Here is an example, a simple mixed effects model where x1 is normal and x2 contains five values that are exactly zero:
id <- rep(1:20,each=5)
y <- rnorm(100)
x1 <- rnorm(100)
x2 <- c(rep(0,5),rnorm(95))
df <- data.frame(id,x1,x2,y)
I fit two models using lmer, one with x1 and the other with x2 as a predictor:
library(lme4)
m1 <- lmer(y~1+x1+(1+x1|id), data=df)
m2 <- lmer(y~1+x2+(1+x2|id), data=df)
Here, I access the Zt component of the fitted model object.
The code below demonstrates that Zt doesn't contain the zero values in x2.
As a result, my very simple conversion into a regular matrix throws an error.
# length okay
length(getME(m1, "Zt")@x)
# model matrix okay
mm1 <- matrix(getME(m1, "Zt")@x, ncol = 2, byrow = TRUE)
# too short
length(getME(m2, "Zt")@x)
# gives error on model matrix
mm2 <- matrix(getME(m2, "Zt")@x, ncol = 2, byrow = TRUE)
Here is what I thought I could do instead. lmer seems to save the raw matrices as well, and extracting them appears to work well as long as there is only one cluster variable.
# seems to work okay
mm3 <- getME(m2,"mmList")[[1]]
However, the mmList component is poorly documented online, and I can hardly find any mention of people using it in their own code.
Accessing Zt seems by far the more common option.
Is it possible to construct the model matrix of random effects from Zt even if the raw model matrix contains zeros?
If not, then what should I expect from mmList?
If mmList is there, then it's not going away (however poorly documented it may be -- feel free to suggest documentation improvements ...). How about (which would seem to generalize correctly for multi-term models)?
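As a minimal sketch of how mmList can be used with more than one random-effects term, the per-term matrices can be column-bound into a single matrix. The second grouping factor grp, the model m3, and the name mm_all are made up for illustration, and column-binding is only one reasonable way to combine the list; the data otherwise follow the example in the question.

```r
library(lme4)

set.seed(1)
id  <- rep(1:20, each = 5)
grp <- rep(1:5, times = 20)            # hypothetical second grouping factor
x2  <- c(rep(0, 5), rnorm(95))         # five exact zeros, as in the question
y   <- rnorm(100)
df  <- data.frame(id, grp, x2, y)

# two random-effects terms: (1 + x2 | id) and (1 | grp)
m3 <- lmer(y ~ 1 + x2 + (1 + x2 | id) + (1 | grp), data = df)

# mmList holds one raw (dense) model matrix per random-effects term
mm_list <- getME(m3, "mmList")

# bind the per-term matrices side by side; zeros in x2 are kept
mm_all <- do.call(cbind, mm_list)
dim(mm_all)
```

Because the matrices in mmList are ordinary dense matrices, the exact zeros in x2 stay where they are, and each row of mm_all still corresponds to one observation.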
I agree that it's a bit of a pain that Zt doesn't distinguish correctly between structural and non-structural zeroes -- it might be possible to change the underlying code to make this work if it were sufficiently important, but I think it would be hard enough that we'd need a pretty compelling use case ...
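If you do want to go through Zt, a rough sketch of one workaround is to index the sparse matrix by position rather than reading its stored values, so that dropped zeros simply come back as 0. This assumes a single random-effects term whose rows in Zt are ordered level by level, with the term's columns cycling fastest within each level; the names Z, g, p and mm_from_Zt are illustrative only.

```r
library(lme4)
library(Matrix)

set.seed(1)
id <- rep(1:20, each = 5)
x2 <- c(rep(0, 5), rnorm(95))          # five exact zeros, as in the question
y  <- rnorm(100)
df <- data.frame(id, x2, y)
m2 <- lmer(y ~ 1 + x2 + (1 + x2 | id), data = df)

# dense n x q version of Z; entries not stored in Zt become explicit zeros
Z <- as.matrix(t(getME(m2, "Zt")))

g  <- as.integer(getME(m2, "flist")[[1]])  # level index of each observation
cn <- getME(m2, "cnms")[[1]]               # column names of the term
p  <- length(cn)                           # number of columns per level
n  <- nrow(Z)

# for observation i in level g[i], column k sits at position (g[i] - 1) * p + k
mm_from_Zt <- sapply(seq_len(p), function(k) {
  Z[cbind(seq_len(n), (g - 1L) * p + k)]
})
colnames(mm_from_Zt) <- cn

# compare with the raw matrix stored in mmList
all.equal(mm_from_Zt, getME(m2, "mmList")[[1]], check.attributes = FALSE)
```

This only recovers the values; it still cannot tell a structural zero from a zero in the data, and for models with several terms the column offsets would have to be worked out per term, which is why mmList is the simpler route.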