How to map over dataframe, is it a tidyeval error?


I want to map over the columns of a data frame and run a t-test of each column against a fixed column. The desired output is a data frame with one or more rows per t-test result; I can switch to map_dfr once the mapping itself works.

Dug into tidy eval, not sure if it's a tidy eval error - any help much appreciated!

(mtcars as toy dataset)

library(ggpubr)     # provides compare_means()
library(tidyverse)  # provides %>% and map()

# Test single cases - good
compare_means(mpg ~ cyl, data = mtcars)
compare_means(disp ~ cyl, data = mtcars)
compare_means(hp ~ cyl, data = mtcars)

# Trial map - fail

mtcars %>%
  map(~compare_means(.x ~ cyl, data = mtcars))

Error: Can't subset columns that don't exist.
x Column `.x` doesn't exist.

Following the tidyeval guidance at https://tidyeval.tidyverse.org/dplyr.html, I tried to see whether quoting / unquoting was the issue, but no dice.

# Abstract variables

test_data <- function(group_var) {
  quote_var <- enquo(group_var)
  data %>% compare_means(quote_var ~ cyl, data = mtcars)
}


There are 2 answers

Brent On BEST ANSWER

Actually, it may just be about formula evaluation specifically:

library(ggpubr)
library(tidyverse)

# Test data with 2 Species only
iris.subset <- iris %>% 
  filter(Species != 'virginica')
# Test single case
iris.subset %>% 
  compare_means(Sepal.Width ~ Species, data = .)

# Test direct map - doesn't work
iris.subset[1:4] %>% 
  map(~compare_means(. ~ Species, data = iris.subset))

Is it about formula evaluation? Test as.formula()

as.formula(paste0(names(iris.subset[1]), " ~ Species"))

# Pipe into test
names(iris.subset[1:4]) %>% 
  map_df(~compare_means(formula = as.formula(paste0(., " ~ Species")), data = iris.subset))

Success!!

Couldn't get an example to work with mtcars but will re-post if I do
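Since the question asked for t-tests collected into one data frame, here is a rough base-R sketch of the same build-the-formula-from-a-name idea. It swaps in stats::t.test() for compare_means() and reformulate() for as.formula(paste0(...)); the helper name t_test_row and the output column names are made up for illustration.

```r
# Keep two species so that a two-sample t-test is well defined
iris_subset <- droplevels(subset(iris, Species != "virginica"))
response_vars <- setdiff(names(iris_subset), "Species")

# Hypothetical helper: run one t-test and return a one-row data frame
t_test_row <- function(var, data) {
  # reformulate() builds the formula `var ~ Species` from strings
  fit <- t.test(reformulate("Species", response = var), data = data)
  data.frame(
    variable  = var,
    statistic = unname(fit$statistic),
    df        = unname(fit$parameter),
    p_value   = fit$p.value,
    stringsAsFactors = FALSE
  )
}

# Map over the column NAMES and row-bind the results
t_test_results <- do.call(
  rbind,
  lapply(response_vars, t_test_row, data = iris_subset)
)
t_test_results
```

The important part is the same as above: iterate over names, not over the columns themselves, and turn each name into a formula before the test function sees it.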

Lionel Henry On

That's an NSE error, but not tidyeval. You're mapping over the vectors inside mtcars. You're not mapping over the column names of mtcars.
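To make that concrete, here is a small base-R sketch (lapply() stands in for map(), and make_formula is a made-up name). It shows that iterating over a data frame hands you the column vectors, and that a formula written as .x ~ cyl keeps the literal symbol .x no matter what value .x holds.

```r
# Iterating over a data frame yields the column vectors, not their names
first_column <- lapply(mtcars, identity)[[1]]
is.numeric(first_column)

# Hypothetical stand-in for the lambda body: the formula is not evaluated,
# so the symbol `.x` is stored as-is instead of being replaced by a column
make_formula <- function(.x) .x ~ cyl
f <- make_formula(mtcars$mpg)

# The formula still refers to a variable literally called ".x"
all.vars(f)
```

That literal `.x` is what the function then tries to look up as a column in the data, hence the "Column `.x` doesn't exist" error.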

With inject() from the latest rlang version you can do some NSE programming with non-tidyeval functions:

setdiff(names(mtcars), "cyl") %>%
  map(~ rlang::inject(compare_means(!!rlang::sym(.x) ~ cyl, data = mtcars)))

Three things are going on:

  • We map over the names of the data frame.
  • We transform the name to a symbol, i.e. an R variable.
  • We inject that symbol into the formula using inject() and !!.
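For anyone without a recent rlang, the same three steps can be sketched in base R, with as.name() in place of sym() and bquote() plus eval() in place of inject() and !!. This is an analogue, not the rlang code itself. It also assumes t.test() instead of compare_means(), and uses am as the fixed column because a two-sample t-test needs exactly two groups.

```r
# 1. Map over column names (a few picked by hand for illustration)
response_vars <- c("mpg", "disp", "hp")

fits <- lapply(response_vars, function(var) {
  # 2. as.name() turns the string into a symbol
  # 3. bquote() splices that symbol into the call wherever .() appears
  test_call <- bquote(t.test(.(as.name(var)) ~ am, data = mtcars))
  eval(test_call)
})
names(fits) <- response_vars

# Pull one number out of each fitted test
p_values <- vapply(fits, function(fit) fit$p.value, numeric(1))
p_values
```

Each element of fits should match what you get from typing the call by hand, e.g. t.test(mpg ~ am, data = mtcars).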

I have not tested the code.