I have a dataframe with one column of a response variable, and several columns of predictor variables. I want to fit models for the response variable using each of the predictor variables separately, finally creating a dataframe that contains the coefficients of the model. Previously, I would have done this:
data(iris)
iris_vars <- c("Sepal.Width", "Petal.Length", "Petal.Width")
fits.iris <- lapply(iris_vars, function(x) {lm(substitute(Sepal.Length ~ i, list(i = as.name(x))), data = iris)})
# extract model coeffs, so forth and so on, eventually combining into a result dataframe
iris.p <- as.data.frame(lapply(fits.iris, function(f) summary(f)$coefficients[,4]))
iris.r <- as.data.frame(lapply(fits.iris, function(f) summary(f)$r.squared))
However, this seems a little cumbersome now that I have begun to use dplyr, broom, etc. Using purrr::map, I can more or less recreate this list of models:
# using purrr; note this still uses the response variable "Sepal.Length" as a predictor of itself
library(dplyr)
library(purrr)
iris %>%
  select(1:4) %>%
  # names(select(., 2:4)) %>% this does not work
  names() %>%
  paste('Sepal.Length ~', .) %>%
  map(~lm(as.formula(.x), data = iris))
However, I am unsure how to get this list into an appropriate form to use with broom::tidy. If I were doing this using grouped rows rather than columns, I would store the model fits and use broom::tidy to do something like this:
iris.fits <- iris %>% group_by(Species) %>% do(modfit1 = lm(Sepal.Length ~ Sepal.Width, data = .))
tidy(iris.fits, modfit1)
Of course this isn't what I am doing, but I was hoping there was a similar procedure when using columns of data. Is there a way, perhaps using tidyr::nest or something similar, to create the desired output?
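For reference, one way the column-wise version might look with purrr and broom is sketched below. This is only a sketch under assumptions: the names `predictors`, `fits`, `coefs`, and `stats` are made up for illustration, and `reformulate()` is swapped in for the `paste()`/`as.formula()` step so the response is excluded from the predictors.

```r
library(dplyr)
library(purrr)
library(broom)

# Predictor columns: the numeric columns of iris, minus the response (illustrative name)
predictors <- setdiff(names(iris)[1:4], "Sepal.Length")

# One simple regression per predictor; set_names() names each fit after its predictor
fits <- predictors %>%
  set_names() %>%
  map(~ lm(reformulate(.x, response = "Sepal.Length"), data = iris))

# Stack the per-model broom output, keeping the predictor name as an id column
coefs <- bind_rows(map(fits, tidy), .id = "predictor")    # one row per coefficient
stats <- bind_rows(map(fits, glance), .id = "predictor")  # one row per model
```

Here `coefs` holds the estimates and p-values that the `summary(f)$coefficients` calls extracted, and `stats` holds model-level values such as `r.squared`.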
1) This gives the glance and tidy output for the model fits:
1a) or as a magrittr pipeline:
Either one gives the following matrix:
or transposed:
2) If we remove the first unlist from the glance_tidy function definition then we get a 2d list (rather than a 2d numeric matrix):
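The definition of glance_tidy is not reproduced here, so the following is only a guess at its shape, not the original code. It is a minimal sketch that assumes the helper concatenates the unlisted `glance` output with the unlisted numeric columns of the `tidy` output. The name `glance_tidy_list` and the named list `fits` are introduced purely for illustration.

```r
library(broom)

iris_vars <- c("Sepal.Width", "Petal.Length", "Petal.Width")

# Named list of fits, so the result columns are labelled by predictor
fits <- lapply(setNames(nm = iris_vars), function(x) {
  lm(reformulate(x, response = "Sepal.Length"), data = iris)
})

# Assumed definition: both pieces unlisted, giving one numeric vector per model
glance_tidy <- function(fit) c(unlist(glance(fit)), unlist(tidy(fit)[, -1]))

# Variant with the first unlist removed: glance() stays a list, so c() returns a list
glance_tidy_list <- function(fit) c(glance(fit), unlist(tidy(fit)[, -1]))

m <- sapply(fits, glance_tidy)       # 2d numeric matrix, one column per model
l <- sapply(fits, glance_tidy_list)  # 2d list with the same layout

t(m)                                 # transposed: one row per model
```

Under these assumptions `m` is a numeric matrix with one column per predictor, while `l` has the same dimensions but is a list, so each cell has to be extracted with `[[`.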