I have a long list of variables and would like to test for differences in survival (p-values) for each of them. I use the survfit() and surv_pvalue() functions to get the result, but I'm having trouble looping over the variables:
library(survival)   # provides Surv() and survfit()
library(survminer)
set.seed(2020)
data <- data.frame(Months = 10 + rnorm(1:10),
                   Status = c(rep((0),5),rep((1),5)),
                   clin = rep("bla bla", 10),
                   Var1 = sample(0:1, 10, replace=T,prob=c(0.5,0.5)),
                   Var2 = sample(0:1, 10, replace=T,prob=c(0.5,0.5)),
                   Var3 = sample(0:1, 10, replace=T,prob=c(0.5,0.5)))
fit.list <- list()
for (i in (4:ncol(data))){
  fit <- survfit(Surv(Months, Status) ~ colnames(data)[3+i], data = data)
  fit2 <- surv_pvalue(fit)
  fit.list[[i]] <- fit2
}
results in:
Error in model.frame.default(formula = Surv(Months, Status) ~ colnames(data)[3 + :
variable lengths differ (found for 'colnames(data)[3 + i]')
This likely means that there is a mismatch between the lengths of 4:ncol(data) and colnames(data)[3 + i], but how exactly do I have to specify them? Thank you in advance for the solutions!
You could use lapply instead of iterating and appending to a list:
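A minimal sketch of that approach, under a few assumptions. In the loop from the question, colnames(data)[3+i] is evaluated inside the formula as a single character string rather than as a column of data, which is why model.frame reports differing variable lengths; in addition, i already starts at 4, so 3+i points past the last column. The sketch builds the formula from the column name with as.formula() instead. It also swaps survfit() for survminer's surv_fit() wrapper, on the assumption that surv_pvalue() recovers the formula more reliably from a fit created that way inside a function. The names vars and v are purely illustrative.

```r
library(survival)   # Surv()
library(survminer)  # surv_fit(), surv_pvalue()

# Same toy data as in the question
set.seed(2020)
data <- data.frame(Months = 10 + rnorm(10),
                   Status = c(rep(0, 5), rep(1, 5)),
                   clin   = rep("bla bla", 10),
                   Var1   = sample(0:1, 10, replace = TRUE, prob = c(0.5, 0.5)),
                   Var2   = sample(0:1, 10, replace = TRUE, prob = c(0.5, 0.5)),
                   Var3   = sample(0:1, 10, replace = TRUE, prob = c(0.5, 0.5)))

# Names of the columns to test (illustrative helper variable)
vars <- colnames(data)[4:ncol(data)]

fit.list <- lapply(vars, function(v) {
  # Build e.g. Surv(Months, Status) ~ Var1 from the column name
  f <- as.formula(paste("Surv(Months, Status) ~", v))
  # Assumption: surv_fit() is used here in place of survfit()
  fit <- surv_fit(f, data = data)
  surv_pvalue(fit, data = data)
})
names(fit.list) <- vars
```

Each element of fit.list is the data frame returned by surv_pvalue() for one variable, so do.call(rbind, fit.list) should collect all p-values into a single table.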