Apply a list of functions to multiple data frames with different numbers of inputs


Following up on this question, I was wondering whether different data frames can be used depending on the function. The example in the linked question only covers mean and sd, which each take a single column. If a two-sample t-test is added to those two functions, it needs a second data.frame in addition to the one used for mean and sd.

Do you have any suggestions for this scenario?

Copying the example from the linked thread:

# Random generation
set.seed(12)
df1 <- data.frame(col1 = sample(1:100, 10, replace = FALSE),
                  col2 = sample(1:100, 10, replace = FALSE))
set.seed(16)
df2 <- data.frame(col3 = sample(10:90, 10, replace = FALSE),
                  col4 = sample(10:90, 10, replace = FALSE))

# Introducing null values
df1$col1[c(3,5,9)] <- NA
df1$col2[c(3,6)] <- NA
df2$col3[c(5,8)] <- NA
df2$col4[c(4,5,9)] <- NA

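# vapply with a function that returns two values (mean and sd) per column
stat <- data.frame(Mean = numeric(length = length(df1)), row.names = colnames(df1))
# vapply returns one matrix column per input column, so transpose to get one row per column
stat[, c('Mean', 'Sd')] <- t(vapply(df1, function(x) c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)), numeric(2)))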

How can I include the function t.test(df1$col1, df2$col3) in this?

funs <- list(sd = sd, mean = mean)
sapply(funs, function(x) sapply(df1, x, na.rm = TRUE))

Expected output (something like the following, as a 2x5 data frame):

mean(col1) sd(col1) mean(col3) sd(col3) t.test(col1,col3)$p.value
mean(col2) sd(col2) mean(col4) sd(col4) t.test(col2,col4)$p.value
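
One possible direction, offered as a minimal sketch rather than a definitive answer: a data frame is a list of columns, so mapply can walk df1 and df2 in parallel and hand each pair of columns to a single function that computes both the one-input statistics and the two-input test. The helper name pair_stats and the output column names are hypothetical, and the sketch assumes df1 and df2 have the same number of columns, paired by position (col1 with col3, col2 with col4).

```r
# Recreate the example data from the question
set.seed(12)
df1 <- data.frame(col1 = sample(1:100, 10, replace = FALSE),
                  col2 = sample(1:100, 10, replace = FALSE))
set.seed(16)
df2 <- data.frame(col3 = sample(10:90, 10, replace = FALSE),
                  col4 = sample(10:90, 10, replace = FALSE))
df1$col1[c(3, 5, 9)] <- NA
df1$col2[c(3, 6)] <- NA
df2$col3[c(5, 8)] <- NA
df2$col4[c(4, 5, 9)] <- NA

# Hypothetical helper: summarise one pair of columns
# (x comes from df1, y from the matching position in df2)
pair_stats <- function(x, y) {
  c(mean_x  = mean(x, na.rm = TRUE),
    sd_x    = sd(x, na.rm = TRUE),
    mean_y  = mean(y, na.rm = TRUE),
    sd_y    = sd(y, na.rm = TRUE),
    p_value = t.test(x, y)$p.value)  # t.test drops missing values itself
}

# mapply iterates over the columns of df1 and df2 in parallel and returns
# a 5 x 2 matrix; transpose it to get one row per column pair
res <- as.data.frame(t(mapply(pair_stats, df1, df2)))
res
```

The rows of res are named after the columns of df1 (col1, col2), because mapply takes its names from the first argument it iterates over. Note that t.test defaults to the Welch test (unequal variances); pass var.equal = TRUE inside pair_stats if the pooled-variance version is wanted.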