Loop through variables to produce balance table

I am creating a balance test table using rstatix. I can produce the output I want for a single variable, but I cannot get it to loop over several variables and produce a table to my liking in one go.

require(dplyr)
require(rstatix)
data <- data.frame(group=rep(c(1,2),5), v1=rnorm(10),v2=rnorm(10))

data %>%
  t_test(v1 ~ group,detailed = TRUE) %>%
  adjust_pvalue() %>%
  add_significance(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
                   symbols = c("****", "***", "**", "*", "ns")) %>%
  select(c(".y.","estimate1","estimate2","statistic","p.adj","p.adj.signif")) %>%
  dplyr::rename(variable = .y.,
                'training' = estimate1,
                'test' = estimate2,
                't-test'=statistic,
                p=p.adj,
                sl=p.adj.signif)

This fails:

vars <- c("V1", "V2")

bt <- character(0)
for(i in 1:length(vars)){
bt_temp <- data %>%
  t_test(vars[i] ~ group, detailed = TRUE) %>%
  adjust_pvalue() %>%
  add_significance(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
                   symbols = c("****", "***", "**", "*", "ns")) %>%
  select(c(".y.","estimate1","estimate2","statistic","p.adj","p.adj.signif")) %>%
  dplyr::rename(variable = .y.,
                'training' = estimate1,
                'test' = estimate2,
                't-test'=statistic,
                p=p.adj,
                sl=p.adj.signif)

bt <- rbind(bt, bt_temp)

Answer by Michael Dewar:

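The key is to use paste0 to piece together a string that you can then convert into a formula with as.formula. Writing vars[i] ~ group directly does not work because a formula is not evaluated: the left-hand side stays as the literal expression vars[i] rather than the column name it holds. In the solution below, I have converted your for loop into a purrr-style map.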

require(dplyr)
require(rstatix)
data <- data.frame(group=rep(c(1,2),5), v1=rnorm(10),v2=rnorm(10))
vars <- c("v1", "v2")

# Run the balance test for one variable, given its column name as a string
my_new_function <- function(var){
    data %>%
        # build the formula from the string, e.g. "v1 ~ group"
        t_test(as.formula(paste0(var, " ~ group")), detailed = TRUE) %>%
        adjust_pvalue() %>%
        add_significance(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
                         symbols = c("****", "***", "**", "*", "ns")) %>%
        select(c(".y.","estimate1","estimate2","statistic","p.adj","p.adj.signif")) %>%
        dplyr::rename(variable = .y.,
                      'training' = estimate1,
                      'test' = estimate2,
                      't-test'=statistic,
                      p=p.adj,
                      sl=p.adj.signif)
}

# Apply the function to every variable and stack the one-row results
vars %>% purrr::map(my_new_function) %>% bind_rows()
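
To see why the formula has to be built from a string, here is a minimal base-R sketch comparing a literally written formula with one assembled via as.formula. The object names (f_literal, f_built, f_alt and the lhs_* helpers) are made up for illustration. reformulate() from the stats package is shown as an alternative I would expect to behave the same way here.

```r
vars <- c("v1", "v2")
i <- 1

# Written literally, the formula keeps the unevaluated expression vars[i]
f_literal <- vars[i] ~ group
lhs_literal <- deparse(f_literal[[2]])

# Built from a string, the left-hand side is the column name itself
f_built <- as.formula(paste0(vars[i], " ~ group"))
lhs_built <- deparse(f_built[[2]])

# reformulate() assembles the same kind of formula without paste0
f_alt <- reformulate("group", response = vars[i])
lhs_alt <- deparse(f_alt[[2]])

print(c(literal = lhs_literal, built = lhs_built, alt = lhs_alt))
```

The literal version carries "vars[i]" as its left-hand side, which is not a column in data, while the other two carry "v1".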

You also had a typo in vars: it contained "V1" and "V2", but the columns are named v1 and v2, and R is case-sensitive. If you really want to use a for loop, you need to add the missing closing }. Also, create

bt_temp <- vector("list", length(vars))

before your loop. Inside the loop, assign each result to bt_temp[[i]], then bind the rows once after the loop. If you grow bt with rbind at each iteration, the accumulated table is copied every time, so the loop will be slow when there are many iterations.
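
Putting the for-loop advice together, a sketch of the loop version could look like the following. It assumes dplyr and rstatix are installed and reuses the same pipeline as the function above. The set.seed call and the name bt_list are my additions for illustration, not part of the original question.

```r
library(dplyr)
library(rstatix)

set.seed(42)  # assumption: seed added only so the sketch is reproducible
data <- data.frame(group = rep(c(1, 2), 5), v1 = rnorm(10), v2 = rnorm(10))
vars <- c("v1", "v2")

# Preallocate one list slot per variable instead of growing a data frame
bt_list <- vector("list", length(vars))

for (i in seq_along(vars)) {
  bt_list[[i]] <- data %>%
    # build the formula from the column name held in vars[i]
    t_test(as.formula(paste0(vars[i], " ~ group")), detailed = TRUE) %>%
    adjust_pvalue() %>%
    add_significance(cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
                     symbols = c("****", "***", "**", "*", "ns")) %>%
    select(c(".y.", "estimate1", "estimate2", "statistic", "p.adj", "p.adj.signif")) %>%
    dplyr::rename(variable = .y.,
                  training = estimate1,
                  test = estimate2,
                  `t-test` = statistic,
                  p = p.adj,
                  sl = p.adj.signif)
}

# Bind once, after the loop has filled every slot
bt <- bind_rows(bt_list)
print(bt)
```

seq_along(vars) is used instead of 1:length(vars) so the loop body is skipped cleanly if vars is ever empty.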