Why is drop1 ignoring linear terms for mixed models?


I have six fixed factors: A, B, C, D, E and F, and one random factor R. I want to test the linear terms, the pure quadratic terms and the two-way interactions in R. So I constructed the full linear mixed model and tried to test its terms with drop1:

full.model <- lmer(Z ~ A + B + C + D + E + F
                     + I(A^2) + I(B^2) + I(C^2) + I(D^2) + I(E^2) + I(F^2)
                     + A:B + A:C + A:D + A:E + A:F
                           + B:C + B:D + B:E + B:F
                                 + C:D + C:E + C:F 
                                       + D:E + D:F
                                             + E:F
                     + (1 | R), data=mydata, REML=FALSE)
drop1(full.model, test="Chisq")

It seems that drop1 is completely ignoring linear terms:

Single term deletions

Model:
Z ~ A + B + C + D + E + F + I(A^2) + I(B^2) + I(C^2) + I(D^2) + 
    I(E^2) + I(F^2) + A:B + A:C + A:D + A:E + A:F + B:C + B:D + 
    B:E + B:F + C:D + C:E + C:F + D:E + D:F + E:F + (1 | R)
       Df    AIC     LRT   Pr(Chi)    
<none>    127177                      
I(A^2)  1 127610  434.81 < 2.2e-16 ***
I(B^2)  1 127378  203.36 < 2.2e-16 ***
I(C^2)  1 129208 2032.42 < 2.2e-16 ***
I(D^2)  1 127294  119.09 < 2.2e-16 ***
I(E^2)  1 127724  548.84 < 2.2e-16 ***
I(F^2)  1 127197   21.99 2.747e-06 ***
A:B     1 127295  120.24 < 2.2e-16 ***
A:C     1 127177    1.75  0.185467    
A:D     1 127240   64.99 7.542e-16 ***
A:E     1 127223   48.30 3.655e-12 ***
A:F     1 127242   66.69 3.171e-16 ***
B:C     1 127180    5.36  0.020621 *  
B:D     1 127202   27.12 1.909e-07 ***
B:E     1 127300  125.28 < 2.2e-16 ***
B:F     1 127192   16.60 4.625e-05 ***
C:D     1 127181    5.96  0.014638 *  
C:E     1 127298  122.89 < 2.2e-16 ***
C:F     1 127176    0.77  0.380564    
D:E     1 127223   47.76 4.813e-12 ***
D:F     1 127182    6.99  0.008191 ** 
E:F     1 127376  201.26 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

If I exclude interactions from the model:

full.model <- lmer(Z ~ A + B + C + D + E + F
                     + I(A^2) + I(B^2) + I(C^2) + I(D^2) + I(E^2) + I(F^2)
                     + (1 | R), data=mydata, REML=FALSE)
drop1(full.model, test="Chisq")

then the linear terms get tested:

Single term deletions

Model:
Z ~ A + B + C + D + E + F + I(A^2) + I(B^2) + I(C^2) + I(D^2) + 
    I(E^2) + I(F^2) + (1 | R)
       Df    AIC    LRT   Pr(Chi)    
<none>    127998                     
A       1 130130 2133.9 < 2.2e-16 ***
B       1 130177 2181.0 < 2.2e-16 ***
C       1 133464 5467.6 < 2.2e-16 ***
D       1 129484 1487.9 < 2.2e-16 ***
E       1 130571 2575.0 < 2.2e-16 ***
F       1 128009   12.7 0.0003731 ***
I(A^2)  1 128418  422.2 < 2.2e-16 ***
I(B^2)  1 128193  197.4 < 2.2e-16 ***
I(C^2)  1 129971 1975.1 < 2.2e-16 ***
I(D^2)  1 128112  115.6 < 2.2e-16 ***
I(E^2)  1 128529  533.0 < 2.2e-16 ***
I(F^2)  1 128017   21.3 3.838e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
1 Answer

Ben Bolker (best answer):

Because this is the way drop1 works (it's not specific to mixed models - you would find this behaviour for a regular linear model fitted with lm as well). From ?drop1:

The hierarchy is respected when considering terms to be added or dropped: all main effects contained in a second-order interaction must remain, and so on.
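
For instance, the same thing happens with a plain lm fit (a minimal sketch with simulated data and made-up variable names, not taken from the question):

## drop1 skips any main effect that is still contained in an interaction
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- with(d, x1 + x2 + rnorm(50))
m <- lm(y ~ x1 * x2 + I(x1^2), data = d)
drop1(m, test = "F")
## rows appear for I(x1^2) and x1:x2 only; x1 and x2 are not offered for
## deletion because they are marginal to x1:x2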

I discuss this at some length in this CrossValidated post.

The statistically tricky part is that testing lower-level interactions in a model that also contains higher-level interactions is (depending on who you talk to) either (i) hard to do correctly or (ii) just plain silly (for the latter position, see part 5 of Bill Venables's "exegeses on linear models"). The rubric for this is the principle of marginality. At the very least, the meaning of the lower-order terms depends sensitively on how contrasts in the model are coded (e.g. treatment vs. midpoint/sum-to-zero). My default rule is that if you're not sure you understand exactly why this might be a problem, you shouldn't violate the principle of marginality.
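
To make the contrast point concrete, here is a small hypothetical two-factor example (simulated data and names of my own): once f1:f2 is in the model, the coefficient reported for f1 is the b-vs-a difference at the reference level of f2 under treatment contrasts, but related to the difference averaged over the levels of f2 under sum-to-zero contrasts.

set.seed(2)
d <- expand.grid(f1 = factor(c("a", "b")), f2 = factor(c("x", "y")), rep = 1:50)
## true effect only in the f1 = "b", f2 = "y" cell
d$y <- rnorm(nrow(d), mean = ifelse(d$f1 == "b" & d$f2 == "y", 2, 0))
## treatment contrasts (the default): f1b is the b-vs-a difference at f2 = "x"
coef(lm(y ~ f1 * f2, data = d))["f1b"]                                 # roughly 0
## sum-to-zero contrasts: f11 is minus half the b-vs-a difference averaged over f2
coef(lm(y ~ f1 * f2, data = d,
        contrasts = list(f1 = "contr.sum", f2 = "contr.sum")))["f11"]  # roughly -0.5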

However, as Venables actually describes in the linked article, you can get R to violate marginality if you want (p. 15):

To my delight I see that marginality constraints between factor terms are by default honoured and students are not led down the logically slippery ‘Type III sums of squares’ path. We discuss why it is that no main effects are shown, and it makes a useful tutorial point.

The irony is, of course, that Type III sums of squares were available all along if only people understood what they really were and how to get them. If the call to drop1 contains any formula as the second argument, the sections of the model matrix corresponding to all non-intercept terms are omitted seriatim from the model, giving some sort of test for a main effect ...

Provided you have used a contrast matrix with zero-sum columns they will be unique, and they are none other than the notorious ‘Type III sums of squares’. If you use, say, contr.treatment contrasts, though, so that the columns do not have sum zero, you get nonsense. This sensitivity to something that should in this context be arbitrary ought to be enough to alert anyone to the fact that something silly is being done.

In other words, using scope = . ~ . will force drop1 to ignore marginality. You do this at your own risk - you should definitely be able to explain to yourself what you're actually testing when you follow this procedure ...

For example:

set.seed(101)
dd <- expand.grid(A = 1:10, B = 1:10, g = factor(1:10))  # two numeric predictors, 10 groups
dd$y <- rnorm(1000)                                      # pure-noise response
library(lme4)
m1 <- lmer(y ~ A*B + (1 | g), data = dd)
drop1(m1, scope = . ~ .)   # scope = . ~ . also tests the main effects A and B
## Single term deletions
## 
## Model:
## y ~ A * B + (1 | g)
##        Df    AIC
## <none>    2761.9
## A       1 2761.7
## B       1 2762.4
## A:B     1 2763.1
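
A follow-up sketch (again simulated data of my own, not part of the original answer) of the contrast sensitivity Venables warns about: with factor predictors, the main-effect rows from drop1 with scope = . ~ . change with the contrast coding, and only zero-sum contrasts give the 'Type III' tests.

set.seed(102)
d2 <- expand.grid(f1 = factor(1:3), f2 = factor(1:3), rep = 1:20)
d2$y <- rnorm(nrow(d2), mean = as.numeric(d2$f1) * as.numeric(d2$f2))
m_trt <- lm(y ~ f1 * f2, data = d2)            # default contr.treatment
m_sum <- lm(y ~ f1 * f2, data = d2,
            contrasts = list(f1 = "contr.sum", f2 = "contr.sum"))
drop1(m_trt, scope = . ~ ., test = "F")  # main-effect rows depend on the reference level
drop1(m_sum, scope = . ~ ., test = "F")  # zero-sum contrasts: the 'Type III' tests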