I'm sorry if this is a dumb question but I've been trying to extract the r squared values from an lmList model object for a few days no with no success. I'm running a custom bootstrap function to allow me to cross validate my results and I'm able to successfully pull out coefficients and RMSE.
GroupedBootStrampling <- function(slice, root, data, iters, met) {
print('Initializing Modeling')
root <- as.vector(root[slice, ])
coefs <- list()
RMSE <- list()
r_sq <- list()
mods <- list()
for (i in 1:iters){
print('Splitting Data into training and testing sets')
comb <- splitter(data,0.8)
train <- comb[[1]]
test <- comb[[2]]
print('Data sucessfully split')
# Create var functional form to pass to lmList model object
train$var <- train$cpc**(1/root[, 1])*train$cost**(1/root[, 2])
test$var <- test$cpc**(1/root[, 1])*test$cost**(1/root[, 2])
train[,'temp'] <- train[,met]
# Use lmList to train and test the models
print('Creating Models for each campaign')
model <- lmList(temp ~ -1 + var + I(var*Mon) + I(var*Tues) + I(var*Wed) + I(var*Thurs) + I(var*Fri) + I(var*Sat) | grouper , data = train)
#model <- as.formula("temp ~ -1 + var + I(var*Mon) + I(var*Tues) + I(var*Wed) + I(var*Thurs) + I(var*Fri) + I(var*Sat)")
#model <- ridgeList(train, model, "grouper")
mods[[i]] <- model
# Predict clicks based on the Test dataset
print('Generating predictions')
preds <- predict(model, test, se.fit = T)
preds$RMSE <- as.numeric((test[,met] - preds$fit)**2)
# Predict RMSE off by subtracting the actual clicks from the predicted clicks for the test dataset
RMSE[[i]] <- as.data.frame(preds %>% group_by(grouper) %>% summarize(RMSE = (mean(RMSE)**0.5)))
# extract the coefficients from the model
coefs[[i]] <- data.frame(groups = unique(train$grouper), coef(model, augFrame = T, data = train, which = 'grouper'))
# extract the r.squared
r_sq[[i]] <- summary(model)$r.squared
}
# Pick out average RMSE and Average coefs
avg_RMSE <- do.call(rbind,RMSE)
avg_coefs <- do.call(rbind,coefs)
avg_r.sq <- do.call(rbind,r_sq)
avg_coefs$RMSE <- avg_RMSE$RMSE
avg_coefs$r_sq <- avg_r.sq
# Create a dataframe of deisred output to be returned from the function
print('Collecting Results')
outs <- avg_coefs %>% group_by(groups) %>% summarize(var_coef = mean(var),
int_mon_coef = mean(I.var...Mon.),
int_tues_coef = mean(I.var...Tues.),
int_wed_coef = mean(I.var...Wed.),
int_thurs_coef = mean(I.var...Thurs.),
int_fri_coef = mean(I.var...Fri.),
int_sat_coef = mean(I.var...Sat.),
RMSE = mean(RMSE),
r_sq = mean(r_sq))
outs$cpc_root <- root[,1]
outs$cost_root <- root[,2]
outs <- na_killer(outs)
# outs <- pred_optimizer2(outs, data)
print('Loop Complete')
return(outs)
}
However when I attempt to run this function which has worked in the past, prior to me adding in the r-squared calculation, I receive the following error:
Error in `[<-`(`*tmp*`, use, use, ii, value = lst[[ii]]) :
subscript out of bounds
Which when traced back points to the r.squared calculation. I've tried various combinations for the subscript and I still can't seem to get it. If I generate an lmList outside of the iteration loop I can use summary(model)$r.squared
and it works without any issues.
If anyone has any ideas I'd really appreciate it!