As a follow-up to a recent SO question (see here), I am wondering how to perform multiple t-tests in R with weighted data (package srvyr). I can't make it run and would be happy if anyone could help me here. I added a random sample in the code below.
Many thanks!
#create data
surveydata <- as.data.frame(replicate(1,sample(1:5,1000,rep=TRUE)))
colnames(surveydata)[1] <- "q1"
surveydata$q2 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$q3 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$q4 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$group <- c(1,2)
#replace all values "6" with NA
surveydata[surveydata == 6] <- NA
#add NAs to group 1 in q1
surveydata$q1[which(surveydata$q1==1 & surveydata$group==1)] = NA
surveydata$q1[which(surveydata$q1==2 & surveydata$group==1)] = NA
surveydata$q1[which(surveydata$q1==3 & surveydata$group==1)] = NA
surveydata$q1[which(surveydata$q1==4 & surveydata$group==1)] = NA
surveydata$q1[which(surveydata$q1==5 & surveydata$group==1)] = NA
#add weights
surveydata$weights <- round(runif(nrow(surveydata), min=0.2, max=1.5), 3)
#create vector for relevant questions
rquest <- names(surveydata)[1:4]
# create survey design
library(srvyr)
surveydesign <- surveydata %>%
  as_survey_design(strata = group, weights = weights, variables = c("group", all_of(rquest)))
# perform multiple t.test (doesn't work yet)
outcome <- do.call(rbind, lapply(names(surveydesign$variables)[-1], function(i) {
  tryCatch({
    test <- t.test(as.formula(paste(i, "~ survey")), data = surveydesign)
    data.frame(question = i,
               group1 = test$estimate[1],
               group2 = test$estimate[2],
               difference = diff(test$estimate),
               p_value = test$p.value, row.names = 1)
  }, error = function(e) {
    data.frame(question = i,
               group1 = NA,
               group2 = NA,
               difference = NA,
               p_value = NA, row.names = 1)
  })
}))
As I understand it, you have a series of question columns, in the example q1 to q4. You've used srvyr to generate a weights column. It is possible in your data that, for a particular question, one entire group may be all NA, and you'd like to generate results into a df even when that is true. You want a weighted Student's t-test making use of the weights column, not a simple t-test. The only function I found that provides that is weights::wtd.t.test, which doesn't offer a formula interface but wants to be fed vectors.

In order of steps taken:
1. A custom function that removes NAs by variable, pulls the vectors for x, y, weightx, weighty, runs the test, and extracts the info you want into a df row.
2. lapply to apply it column by column (notice it handles the case in q2 where group == 1 is all NA).
3. do.call and rbind to make the df you desire.

Your data (without showing all the gyrations to create it and heading the first 200 rows)