In reference to this question, I was wondering whether multiple data frames can be used, depending on the function. The example in the linked question only covers mean and sd. If a two-sample t-test has to be added to those two functions, we need a second data.frame besides the one used for mean and sd.
Do you have any suggestions for this scenario?
Copying the example from the linked thread:
# Random generation
set.seed(12)
df1 <- data.frame(col1 = sample(1:100, 10, replace=FALSE),
                  col2 = sample(1:100, 10, replace=FALSE))
set.seed(16)
df2 <- data.frame(col3 = sample(10:90, 10, replace=FALSE),
                  col4 = sample(10:90, 10, replace=FALSE))
# Introducing null values
df1$col1[c(3,5,9)] <- NA
df1$col2[c(3,6)] <- NA
df2$col3[c(5,8)] <- NA
df2$col4[c(4,5,9)] <- NA
# vapply returns a 2 x ncol(df1) matrix (one column per input column), so transpose it before assigning
stat <- data.frame(Mean = numeric(length = length(df1)), row.names = colnames(df1))
stat[, c('Mean', 'Sd')] <- t(vapply(df1, function(x) c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)), numeric(2)))
How can I include the function t.test(df1$col1, df2$col3) in this?
funs <- list(sd=sd, mean=mean)
sapply(funs, function(x) sapply(df1, x, na.rm = TRUE))
Expected output (something like the following, as a 2x5 data frame):
mean(col1) sd(col1) mean(col3) sd(col3) t.test(col1,col3)$p.value
mean(col2) sd(col2) mean(col4) sd(col4) t.test(col2,col4)$p.value
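To make the target layout concrete, here is a minimal sketch of one way it might be produced. It assumes the columns of df1 and df2 are paired by position (col1 with col3, col2 with col4) and that the default Welch two-sample t.test is the intended test. The helper name pair_stats and the output column names are hypothetical, chosen only for illustration.

```r
# Same data as in the example above
set.seed(12)
df1 <- data.frame(col1 = sample(1:100, 10, replace = FALSE),
                  col2 = sample(1:100, 10, replace = FALSE))
set.seed(16)
df2 <- data.frame(col3 = sample(10:90, 10, replace = FALSE),
                  col4 = sample(10:90, 10, replace = FALSE))
df1$col1[c(3, 5, 9)] <- NA
df1$col2[c(3, 6)] <- NA
df2$col3[c(5, 8)] <- NA
df2$col4[c(4, 5, 9)] <- NA

# Hypothetical helper: all five statistics for one pair of columns
pair_stats <- function(x, y) {
  c(mean_x  = mean(x, na.rm = TRUE),
    sd_x    = sd(x, na.rm = TRUE),
    mean_y  = mean(y, na.rm = TRUE),
    sd_y    = sd(y, na.rm = TRUE),
    p_value = t.test(x, y)$p.value)  # unpaired t.test drops NAs on its own
}

# mapply walks over the columns of df1 and df2 in parallel and returns
# a 5 x 2 matrix (one column per pair), so transpose to get 2 x 5
res <- as.data.frame(t(mapply(pair_stats, df1, df2)))
rownames(res) <- paste(names(df1), names(df2), sep = "_vs_")
res
```

Each row of res corresponds to one column pair, and the columns follow the order in the expected output above. The pairing by position is the key assumption: if the two data frames had different numbers of columns, mapply would recycle or fail, so the pairing would need to be stated explicitly.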