Inconsistent predictions from lmList and data.table

155 views Asked by DP5000 At 27 December 2016 at 21:11

I'm working on generating predictions from an lmList object holding some variables constant, but am finding inconsistent predictions between using a full cartesian product of data and using only a few rows. It doesn't seem to matter if I use lme4 or nlme versions of lmList.

library(data.table)
library(boot)
library(lme4)

set.seed(31)

# SET PRIMARY ATTRIBUTES
x2 <- c(-1.5800,-1.0660,-0.7640,-0.4970,-0.2395,0.0250,0.3140,0.6535,1.1350,2.0750)
x1 <- c(-1, 0, 1)
grp <- factor(letters[1:5])

# SET MEANS FOR EACH GROUP
y_grp_mn <- runif(5, min = 0.3, max = 0.6)

# EXPAND GRID
dt <- expand.grid(x1, x2, grp)
names(dt) <- c("x1", "x2", "grp")

# EXPAND DT
dt <- rbind(dt, dt, dt, dt, dt)

# SET BINARY OUTCOME WITH y_grp_mn PROBABILITIES
dt$y <- as.numeric(lapply(dt$grp, function(x) { 
  ifelse(rnorm(1, mean = y_grp_mn[as.numeric(x)]) > 0.5, 1, 0 )
  }))

# CONVERT TO DATA TABLE
dt <- data.table(dt)

# GENERATE A LINEAR MODEL AT EACH LEVEL GRP
glm_list <- lmList(y ~ x1 + x2 | grp,
                   family = binomial(link = "logit"),
                   data = dt)


# CREATE A SIMPLE TEST DATASET FROM MEANS AT EACH LEVEL OF X1 
# GET MEANS AT EACH LEVEL OF GRP x X2
grp_means <- dt[, .(mn = mean(y)), by = .(grp, x2)]
# REPLICATE AT X1
pred_data <- rbind(data.table(grp_means, x1 = -1),
                        data.table(grp_means, x1 = 0),
                        data.table(grp_means, x1 = 1))

# GENERATE PREDICTIONS FOR TEST DATA
pred_data[ , preds := inv.logit(predict(glm_list, as.data.frame(pred_data)))]

# EXAMINE PREDICTIONS AT LEVELS OF X1 HOLDING GRP AND X2 CONSTANT 
pred_data[c(50,100,150),]
# grp    x2  mn x1     preds
# 1:   e 2.075 0.6 -1 0.4689207
# 2:   e 2.075 0.6  0 0.3400327
# 3:   e 2.075 0.6  1 0.6517386

# TRY TO REPLICATE PREDICTED VALUES ON SAME SUBSET OF FULL DATA
inv.logit(predict(glm_list, pred_data[c(50,100,150),]))
# e         e         e 
# 0.4725348 0.5642330 0.6517386

As shown only the third row of the subset replicated. Is there something I'm missing?

Thanks.

Original Q&A

TechQA.

Inconsistent predictions from lmList and data.table

There are 0 answers

Related Questions in LME4

Related Questions in NLME

Popular Questions

Trending Questions