emmeans - can you use a baseline measure in a contrast (i.e. as a different variable)?

431 views Asked by At

Best practice when analysing data from an RCT is to adjust for the baseline measure (ancova). However, researchers often still ask for change from baseline in each group and their relative difference (given by the treatment x time interaction). When one adjusts for the baseline measure, contrasts involving time can only be made from the first post-baseline measure.

Is there a way to perform a contrast where you subtract the model prediction at each post-treatment time point from the baseline measure (e.g. the overall pre-treatment mean of the combined groups)? I could do this manually but wouldn’t get C.I.'s.

Some dummy data are below:

> m1 <- lme4::lmer(measure ~ baseline + treat*time + (1|id), data = dat)
> summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: measure ~ baseline + treat * time + (1 | id)
   Data: dat

REML criterion at convergence: 436.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.33301 -0.35625 -0.00052  0.41946  1.55502 

Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 11.379   3.373   
 Residual              7.795   2.792   
Number of obs: 80, groups:  id, 40

Fixed effects:
             Estimate Std. Error t value
(Intercept)   -1.7660     4.1216  -0.428
baseline       0.6898     0.1257   5.486
treatB         1.7293     1.4064   1.230
time3          7.8969     0.8829   8.944
treatB:time3  -5.9876     1.2486  -4.795

Correlation of Fixed Effects:
            (Intr) baseln treatB time3 
baseline    -0.971                     
treatB      -0.335  0.175              
time3       -0.107  0.000  0.314       
treatB:tim3  0.076  0.000 -0.444 -0.707

What I would like to do in each case below is subtract the baseline from the predicted mean. I am guessing the reference grid doesn't work this way, but thought I would ask anyway.

> mb <- mean(dat$baseline[dat$time == 2]) 
> mod_new_rg <- ref_grid(m1, at = list(baseline = mb))
> emmeans(mod_new_rg, ~ baseline + time + treat)
 baseline time treat emmean    SE   df lower.CL upper.CL
     30.9 2    A       19.5 0.987 54.5     17.5     21.5
     30.9 3    A       27.4 0.987 54.5     25.4     29.4
     30.9 2    B       21.3 0.987 54.5     19.3     23.2
     30.9 3    B       23.2 0.987 54.5     21.2     25.1

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 
dat <- structure(list(id = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 
5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 10L, 10L, 11L, 11L, 12L, 
12L, 13L, 13L, 14L, 14L, 15L, 15L, 16L, 16L, 17L, 17L, 18L, 18L, 
19L, 19L, 20L, 20L, 21L, 21L, 22L, 22L, 23L, 23L, 24L, 24L, 25L, 
25L, 26L, 26L, 27L, 27L, 28L, 28L, 29L, 29L, 30L, 30L, 31L, 31L, 
32L, 32L, 33L, 33L, 34L, 34L, 35L, 35L, 36L, 36L, 37L, 37L, 38L, 
38L, 39L, 39L, 40L, 40L), .Label = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
"17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", 
"28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", 
"39", "40"), class = "factor"), treat = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), time = structure(c(2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L), .Label = c("1", 
"2", "3"), class = "factor"), measure = c(17.3200614254647, 28.8156330714954, 
21.160508157976, 30.1996968351034, 15.8306421666642, 24.6343639611382, 
27.0257307671622, 35.280124692084, 21.5414605781997, 30.1710609087278, 
24.7492184706737, 29.1159089608813, 13.7075342092653, 19.3839948700176, 
30.9064084394, 43.0450835924821, 16.4621181393495, 27.9256720625141, 
22.067795128987, 30.1292791903279, 22.2987617525344, 28.6339573643738, 
10.0416232025695, 17.9790809598907, 22.3419910935755, 33.874536822989, 
16.0349555541677, 24.2475148150631, 18.9908539202081, 25.6010342912277, 
14.7967655238907, 19.390122202679, 22.5225831869562, 30.9891588158201, 
18.8686550222759, 28.6210275251346, 21.4007984220698, 25.9102099243138, 
25.9311028475919, 27.9896540407965, 25.6541596409646, 16.7652666970158, 
19.2697851662734, 23.9828522768379, 16.7909337557515, 21.4816672428583, 
22.19284590574, 28.8766659163021, 15.7183387191222, 26.3850724291503, 
23.3685103961168, 28.7964869754369, 34.2230257941104, 40.6329654650291, 
20.4357830783204, 18.3755342847755, 19.8702376058352, 20.3089453680716, 
8.32292449233602, 16.4392611472045, 26.7694252442986, 21.7984979391771, 
20.2887000047016, 20.091994056284, 23.0445006938799, 18.5579489512776, 
14.5007251728231, 14.2647965430465, 18.9590380042456, 21.8548073291541, 
16.8014625827141, 13.9396349007877, 23.0599924520126, 24.3582989818999, 
17.3356080916247, 21.5054270440033, 20.8988975305153, 24.5258518904388, 
24.069743228952, 26.8174125810013), baseline = c(30.5160390487568, 
30.5160390487568, 32.6451428104435, 32.6451428104435, 28.0425948684312, 
28.0425948684312, 37.5537250807658, 37.5537250807658, 32.000871324751, 
32.000871324751, 34.9337043558422, 34.9337043558422, 32.069431652421, 
32.069431652421, 41.3254959449604, 41.3254959449604, 22.680750564568, 
22.680750564568, 38.8196824239253, 38.8196824239253, 30.9474784100669, 
30.9474784100669, 28.3893633655831, 28.3893633655831, 30.5016640400354, 
30.5016640400354, 28.2643729861136, 28.2643729861136, 30.6368731962024, 
30.6368731962024, 28.5343771037685, 28.5343771037685, 31.8351023461854, 
31.8351023461854, 31.3123200573486, 31.3123200573486, 31.0422497831132, 
31.0422497831132, 34.8584162616575, 34.8584162616575, 35.7527843999141, 
35.7527843999141, 23.7345582720131, 23.7345582720131, 26.800749534566, 
26.800749534566, 24.5834704596406, 24.5834704596406, 30.7699429023659, 
30.7699429023659, 28.9525228089337, 28.9525228089337, 48.7738392068561, 
48.7738392068561, 32.4395246096931, 32.4395246096931, 27.0239536784171, 
27.0239536784171, 21.4698209122962, 21.4698209122962, 34.4119498713112, 
34.4119498713112, 29.5423378282026, 29.5423378282026, 35.9475237918695, 
35.9475237918695, 28.7394363441077, 28.7394363441077, 28.9798112731033, 
28.9798112731033, 29.0543497219897, 29.0543497219897, 28.3198356223554, 
28.3198356223554, 27.0389712071391, 27.0389712071391, 26.97212084762, 
26.97212084762, 28.4431727013629, 28.4431727013629)), class = "data.frame", row.names = c(NA, 
-80L))
1

There are 1 answers

1
Russ Lenth On BEST ANSWER

As I said in a comment, the baseline is already set at its mean by default, so you get exactly the same the adjusted means as shown in the OP:

> (EMM = emmeans(m1, ~ time + treat))
 time treat emmean    SE   df lower.CL upper.CL
 2    A       19.5 0.987 54.5     17.5     21.5
 3    A       27.4 0.987 54.5     25.4     29.4
 2    B       21.3 0.987 54.5     19.3     23.2
 3    B       23.2 0.987 54.5     21.2     25.1

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

If you want to subtract the baseline mean from each adjusted mean in EMM, you can do it with the offset argument in contrast:

> mb <- mean(dat$baseline)
> CHG = contrast(EMM, "identity", offset = -mb, estName = "EMM - baseline")
> confint(CHG)
 contrast EMM - baseline    SE   df lower.CL upper.CL
 2 A              -11.34 0.987 54.5   -13.32    -9.36
 3 A               -3.44 0.987 54.5    -5.42    -1.47
 2 B               -9.61 0.987 54.5   -11.59    -7.63
 3 B               -7.70 0.987 54.5    -9.68    -5.73

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

However, this does not take the error in estimating the baseline mean into account, so the SEs are too optimistic. Obviously, you can specify other offsets, or a vector of 4 different offsets, as suits your purposes.

Another thing that may or may not be useful is to remove baseline from the model:

> emmeans(m1, ~ time + treat, submodel = ~ treat*time)
 time treat emmean    SE   df lower.CL upper.CL
 2    A       20.2 0.979 54.8     18.2     22.2
 3    A       28.1 0.979 54.8     26.1     30.1
 2    B       20.6 0.979 54.8     18.6     22.5
 3    B       22.5 0.979 54.8     20.5     24.5

submodel: ~ treat + time + treat:time 
Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95

... and then perhaps doing some offset manipulations as above. This is only an approximation of what you'd get by fitting the model without the baseline and then running emmeans(); it'd be exact if you'd used lm() rather than lmer().

Addendum

One problem here in CHG is that we have a contrast variable instead of time and treat. That should be fixable via levels(CHG) = levels(EMM), however, I just learned that the levels<- method has a bug and does not copy offsets. So a hack is needed:

> g = CHG@grid
> levels(CHG) = levels(EMM)
> CHG@grid$.offset. = g$.offset.
> confint(CHG)
 time treat EMM - baseline    SE   df lower.CL upper.CL
 2    A             -11.34 0.987 54.5   -13.32    -9.36
 3    A              -3.44 0.987 54.5    -5.42    -1.47
 2    B              -9.61 0.987 54.5   -11.59    -7.63
 3    B              -7.70 0.987 54.5    -9.68    -5.73

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95  

I have fixed this for the next update (code had .offset instead of .offset., trailing dot omitted).