How to get R to read function variables with rstatix


I am trying to run multiple independent t-tests on a large data frame. When I wrap the test in a function so that I can loop over columns, rstatix does not substitute the values of the function arguments; it treats the argument names themselves as variables.

Example data

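if (!require(rstatix)) { install.packages("rstatix"); library(rstatix) }
library(ggplot2)  # ggplot(), used for the plots below
library(ggpubr)   # stat_pvalue_manual(), used in the functions below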

set.seed(1)
df <- data.frame(
Type = sprintf("Type_%s", rep.int(1:2, times = 10)),
Read = rnorm(20))

T-test

stat.test <- df %>%
  t_test(Read ~ Type, paired = FALSE)
stat.test

Plot without statistics

ggplot(df, aes(x = Type, y = Read))  + 
      geom_boxplot(aes(fill= Type)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30)


Example function (works fine!)

my.function <-
function(df, var1, var2) {
    
    ggplot(df, aes_string(x = var1, y = var2))  + 
      geom_boxplot(aes_string(fill= var1)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30)
}
my.function(df, 'Type', 'Read')


My issue

my.function <-
function(df, var1, var2) {
    stat.test <- df %>%
      t_test(var2 ~ var1, paired = FALSE)
    
    ggplot(df, aes_string(x = var1, y = var2))  + 
      geom_boxplot(aes_string(fill= var1)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30) + 
      stat_pvalue_manual(stat.test, label = "p", y.position = 2.1)
}
my.function(df, 'Type', 'Read')

The above returns an error because t_test() treats var1 and var2 in the formula as literal column names of the example data frame, rather than using the strings they hold.

I have tried the following two approaches to make R substitute the values, but both fail.

Attempt 1

my.function <-
function(df, var1, var2) {
    stat.test <- df %>%
      t_test(eval(parse(var2)) ~ eval(parse(var1)), paired = FALSE)
    
    ggplot(df, aes_string(x = var1, y = var2))  + 
      geom_boxplot(aes_string(fill= var1)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30) + 
      stat_pvalue_manual(stat.test, label = "p", y.position = 2.1)
}
my.function(df, 'Type', 'Read')

Attempt 2

my.function <-
function(df, var1, var2) {
    stat.test <- df %>%
      t_test(eval(as.name(paste(var2))) ~ eval(as.name(paste(var1))), paired = FALSE)
    
    ggplot(df, aes_string(x = var1, y = var2))  + 
      geom_boxplot(aes_string(fill= var1)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30) + 
      stat_pvalue_manual(stat.test, label = "p", y.position = 2.1)
}
my.function(df, 'Type', 'Read')

Best answer, by Noah_Seagull

I read the source of t_test() to find out why my attempts failed. I suspected the issue lay in how R handles formulas inside functions: a formula written as var2 ~ var1 keeps the symbols var2 and var1 rather than the strings they hold. Building the formula from those strings with as.formula(paste(...)) and passing it to t_test() got the function working.
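The formula behaviour suspected above can be checked in base R, without rstatix. This is a minimal sketch: var1 and var2 are defined here as stand-ins for the function arguments, and f_literal and f_built are names invented for the illustration.

```r
# Stand-ins for the string arguments passed to my.function
var1 <- "Type"
var2 <- "Read"

# A formula literal captures the symbols as written, not their values
f_literal <- var2 ~ var1
all.vars(f_literal)   # "var2" "var1"

# Pasting the strings together first yields the real column names
f_built <- as.formula(paste(var2, "~", var1))
all.vars(f_built)     # "Read" "Type"
```

all.vars() lists the variables a formula refers to, so the first result shows why t_test() went looking for columns named var1 and var2.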

my.function <-
function(df, var1, var2) {
    stat.test <- df %>%
      t_test(as.formula(paste(var2, '~', var1)), paired = FALSE)
    
    ggplot(df, aes_string(x = var1, y = var2))  + 
      geom_boxplot(aes_string(fill= var1)) +
      geom_dotplot(binaxis='y', stackdir='center', dotsize=1, binwidth = 1/30) + 
      stat_pvalue_manual(stat.test, label = "p", y.position = 2.1)
}
my.function(df, 'Type', 'Read')
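Since the original goal was to run many independent t-tests, the same idea can be extended to several response columns. This is only a sketch under assumptions: the df2 data frame, its Write column, and the run_t_tests() helper are hypothetical names added for illustration, and base R's reformulate() is used here in place of as.formula(paste(...)) to build each formula.

```r
library(rstatix)

set.seed(1)
# Hypothetical data: one grouping column and two numeric columns
df2 <- data.frame(
  Type  = sprintf("Type_%s", rep.int(1:2, times = 10)),
  Read  = rnorm(20),
  Write = rnorm(20)
)

# Hypothetical helper: run one t-test per response column.
# `responses` and `group` are column names given as strings.
run_t_tests <- function(data, responses, group) {
  results <- lapply(responses, function(resp) {
    # reformulate() builds a formula such as Read ~ Type from strings
    t_test(data, reformulate(group, response = resp), paired = FALSE)
  })
  # Name each result after the column that was tested
  setNames(results, responses)
}

all_tests <- run_t_tests(df2, c("Read", "Write"), "Type")
all_tests$Read
```

Returning a named list keeps each test result separate, so any single element can still be passed to stat_pvalue_manual() for the matching plot.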
