Dunn's test: looping over the columns of a data frame

I am trying to run Dunn's test on the iris data. I want to loop over the 4 numeric columns and run Dunn's test on each one, comparing the species. However, when I try to refer to the columns by name inside the loop, it does not work. Can anybody tell me why?

library(rstatix)
data<-iris
for (i in seq(1:4)) {
  a<-colnames(data)
  colname1 <-as.character(a[5])
  colname2 <-as.character(a[i])
  dtest<-data %>% 
   dunn_test( get(colname2) ~ get(colname1), p.adjust.method = "BH") 
  print(dtest)
  print(i)
}

There are 2 answers

jay.sf On BEST ANSWER

dunn_test wants a formula that names the columns, whereas get(colname2) ~ get(colname1) mixes data lookups into the formula. You could patch your for loop like this:

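library("rstatix")
data <- iris
a <- colnames(data)  # column names; a[5] is the grouping column (Species)
for (i in 1:4) {
  # build e.g. Sepal.Length ~ Species from the column names
  dtest <- dunn_test(data, as.formula(paste(a[i], a[5], sep="~")), 
                     p.adjust.method="BH")
  print(dtest)
  print(i)
}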
# # A tibble: 3 x 9
#   .y.    group1  group2    n1    n2 statistic        p    p.adj p.adj.signif
# * <chr>  <chr>   <chr>  <int> <int>     <dbl>    <dbl>    <dbl> <chr>       
# 1 Sepal~ setosa  versi~    50    50      6.11 1.02e- 9 1.53e- 9 ****        
# 2 Sepal~ setosa  virgi~    50    50      9.74 2.00e-22 6.00e-22 ****        
# 3 Sepal~ versic~ virgi~    50    50      3.64 2.77e- 4 2.77e- 4 ***         
# [1] 1
# [...]
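
To see what the patched loop changes, here is a minimal base-R sketch that only inspects the formulas and does not call dunn_test. The names response, group, f1, f2 and f3 are made up for illustration. It assumes, as the explanation above suggests, that the variables named in the formula need to be actual column names.

```r
# Column names taken from the built-in iris data
cols <- names(iris)
response <- cols[1]  # "Sepal.Length"
group <- cols[5]     # "Species"

# Two ways to build the same formula from strings
f1 <- as.formula(paste(response, group, sep = "~"))
f2 <- reformulate(group, response = response)
identical(deparse(f1), deparse(f2))  # TRUE

# The formula from the question; `~` does not evaluate its operands,
# so colname1 and colname2 do not need to exist here
f3 <- get(colname2) ~ get(colname1)

all.vars(f1)  # "Sepal.Length" "Species"
all.vars(f3)  # "colname2" "colname1"
```

In f3 the variables are the helper objects holding the names, not the columns themselves, which is why building the formula from the strings is the safer route.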

Another way is to Vectorize both reformulate and dunn_test, so that all four response columns are handled in a single call.

dunn_testv <- Vectorize(dunn_test, vectorize.args="formula", SIMPLIFY=FALSE)
reformulatev <- Vectorize(reformulate, vectorize.args="response")

res <- dunn_testv(iris, reformulatev("Species", names(iris)[1:4]), p.adjust.method="BH")
res
# $Sepal.Length
# # A tibble: 3 x 9
#   .y.    group1  group2    n1    n2 statistic        p    p.adj p.adj.signif
# * <chr>  <chr>   <chr>  <int> <int>     <dbl>    <dbl>    <dbl> <chr>       
# 1 Sepal~ setosa  versi~    50    50      6.11 1.02e- 9 1.53e- 9 ****        
# 2 Sepal~ setosa  virgi~    50    50      9.74 2.00e-22 6.00e-22 ****        
# 3 Sepal~ versic~ virgi~    50    50      3.64 2.77e- 4 2.77e- 4 ***
# [...]
Ronak Shah On

You can use lapply to iterate over the column names and reformulate to create the formula object. Using the iris dataset you can do:

colname1 <- names(iris)[5]
colname2 <- names(iris)[1:4]

data <- lapply(colname2, function(x) {
           rstatix::dunn_test(iris, reformulate(colname1, x),  
                              p.adjust.method = "BH")
         })
data
#[[1]]
# A tibble: 3 x 9
#  .y.          group1     group2        n1    n2 statistic        p    p.adj p.adj.signif
#* <chr>        <chr>      <chr>      <int> <int>     <dbl>    <dbl>    <dbl> <chr>       
#1 Sepal.Length setosa     versicolor    50    50      6.11 1.02e- 9 1.53e- 9 ****        
#2 Sepal.Length setosa     virginica     50    50      9.74 2.00e-22 6.00e-22 ****        
#3 Sepal.Length versicolor virginica     50    50      3.64 2.77e- 4 2.77e- 4 ***         

#[[2]]
# A tibble: 3 x 9
#  .y.         group1     group2        n1    n2 statistic        p    p.adj p.adj.signif
#* <chr>       <chr>      <chr>      <int> <int>     <dbl>    <dbl>    <dbl> <chr>       
#1 Sepal.Width setosa     versicolor    50    50     -7.79 6.82e-15 2.05e-14 ****        
#2 Sepal.Width setosa     virginica     50    50     -5.37 7.68e- 8 1.15e- 7 ****        
#3 Sepal.Width versicolor virginica     50    50      2.41 1.58e- 2 1.58e- 2 * 
#...
#...
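
The list above is indexed by position ([[1]], [[2]], ...). As a possible variation, and only a sketch, the input vector can be named with setNames so that each result can be looked up by column name. The names responses, group, formulas and results are invented for this example, and the dunn_test call is guarded because rstatix may not be installed.

```r
responses <- names(iris)[1:4]
group <- names(iris)[5]

# One formula per response column; lapply keeps the names of its input
formulas <- lapply(setNames(responses, responses),
                   function(x) reformulate(group, response = x))
names(formulas)

# Run the tests only when rstatix is available
if (requireNamespace("rstatix", quietly = TRUE)) {
  results <- lapply(formulas, function(f) {
    rstatix::dunn_test(iris, f, p.adjust.method = "BH")
  })
  # a single test can now be retrieved as results$Sepal.Width
  print(names(results))
}
```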