Selecting different columns of a data frame to run ANOVA in a for loop in R?

I am trying to run an ANOVA on binomially distributed data in a data frame named MegaData, whose first 4 columns hold the categorical variables, including Unit, Year and species. Below is my R code, which does not work when I try to loop over the remaining columns of the data frame to fit my model.

mod <- list()
for (i in MegaData[,5:36]) {
  for(j in length(MegaData[,5:36])){
    mod[[j]] <- glm(i/number ~ Unit*BeginYear*species_raw,
    family = binomial(link = logit), weight=number, 
    data = MegaData)
    print(anova(mod[[j]]), test="Chisq")
    print(summary(mod[[j]]))
  }
}

There is 1 answer

Answer by josliber:

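In your code, the inner loop for (j in length(MegaData[,5:36])) runs only once, with j = 32, so every fit overwrites the same list element, and i holds a column's values rather than its name. If you are trying to fit one model for each of columns 5 to 36, keeping the fitted models in a list and printing the ANOVA table and summary of each, you could loop over the column indices instead: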

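mod <- list()
for (j in 5:36) {
  col <- names(MegaData)[j]
  # Build the formula from the column name, e.g. dat1/number ~ ...
  f <- as.formula(paste0(col, "/number ~ Unit*BeginYear*species_raw"))
  # Store each fit under its column name so the list has no empty slots
  mod[[col]] <- glm(f, family = binomial(link = logit), weights = number,
                    data = MegaData)
  # test = "Chisq" is an argument of anova(), not of print()
  print(anova(mod[[col]], test = "Chisq"))
  print(summary(mod[[col]]))
}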

Data:

set.seed(144)
MegaData <- data.frame(number = sample(1001:2000, 1000, replace = TRUE),
                       Unit = sample(1:10, 1000, replace = TRUE),
                       BeginYear = sample(2000:2010, 1000, replace = TRUE),
                       species_raw = sample(letters[1:3], 1000, replace = TRUE))
for (i in 1:32) {
  MegaData[[paste0("dat", i)]] <- sample(1:1000, 1000, replace = TRUE)
}
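
A variation on the same idea, offered only as a rough sketch and not as part of the original answer: lapply over the column names returns a named list of fits without an explicit loop counter. The data frame toy, the helper fit_one and the vector response_cols are hypothetical names made up for this illustration, and the data are simulated. The response is written as cbind(successes, failures), which for a binomial glm should give the same fit as a proportion with weights.

```r
# Simulated stand-in for MegaData; every name below is an illustrative assumption.
set.seed(144)
n <- 200
toy <- data.frame(
  number      = sample(1001:2000, n, replace = TRUE),
  Unit        = factor(sample(1:3, n, replace = TRUE)),
  BeginYear   = sample(2000:2010, n, replace = TRUE),
  species_raw = factor(sample(letters[1:3], n, replace = TRUE))
)
for (i in 1:3) {
  toy[[paste0("dat", i)]] <- sample(1:1000, n, replace = TRUE)
}

# Columns to model: everything after the first four.
response_cols <- names(toy)[5:7]

# Fit one binomial GLM for a given response column name.
fit_one <- function(col) {
  # Two-column response: successes and failures out of `number` trials.
  f <- as.formula(sprintf(
    "cbind(%s, number - %s) ~ Unit * BeginYear * species_raw", col, col
  ))
  glm(f, family = binomial(link = "logit"), data = toy)
}

# Named list of fitted models, one per response column.
mods <- lapply(response_cols, fit_one)
names(mods) <- response_cols

# Analysis-of-deviance table for each model, with chi-squared tests.
tables <- lapply(mods, anova, test = "Chisq")
print(tables[["dat1"]])
```

Individual fits can then be retrieved by name, for example summary(mods[["dat2"]]).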