Kruskal Wallis Test and subsetting

Question

Asked by Greeny on 06 September 2017 at 11:30 (2.3k views)

Could you please help me perform a Kruskal-Wallis test on a subset of my data? I would like to test for differences in "N" between "Producers".

names(Isotope.Data)
[1] "Species"         "Name"            "Group"           "Simple_Group"       "Trophic_Group"  
[6] "Sample"          "N"               "C"

In my CSV file I have a column "Trophic_Group" which separates Consumers and Producers.

table(Isotope.Data$Trophic_Group)

Consumer Producers  
    61         18

Under the column heading Simple_Group, I have three Producers: Rhodophyta, Seagrass and Phaeophyceae.

table(Isotope.Data$Simple_Group)

 Abalone  Loliginidae      Octopus Phaeophyceae   Rhodophyta     Seagrass      Teleost 
      24            2           12            6            9            3           20 
Tunicate 
       3

I have tried numerous things, but I get various error messages. Would anyone be able to improve on the following code?

kruskal.test(C ~ Simple_Group, data = Isotope.Data, subset = Isotope.Data$Trophic_Group = "Producers")

P.S. I have created a separate CSV file which only includes Primary Producers. However, a subsequent Dunn test of multiple comparisons, used to determine which levels differ from each other, gives different significance levels from the same test run on the data that include both Consumers and Producers.

Original Q&A

There are 2 answers

**RadRel** · Answer 1 · 2022-11-08T22:37:25+00:00

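You can also use the map() function from the purrr package to apply the test to each group once the data are split. Note that group_split() comes from dplyr, and that the formula and data must be passed to kruskal.test() in that order (or by name):

library(dplyr)  # group_split() and %>%
library(purrr)  # map()
test <- df %>% group_split(phase) %>% map(~ kruskal.test(val ~ distance, data = .x))
test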
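If only the p-values are needed, a minimal sketch along the same lines is shown here. It assumes purrr is installed, uses base split() instead of group_split() so that the results keep the group names, and builds its own stand-in data (the seed and the alternating distance column are arbitrary choices, not part of the original answer).

```r
library(purrr)

# Stand-in data shaped like the dummy example: 60 values, 2 distances, 3 phases
set.seed(1)  # arbitrary seed, only for reproducibility
df <- data.frame(
  val      = runif(60, min = 0, max = 100),
  distance = rep(1:2, times = 30),
  phase    = rep(c("a", "b", "c"), 20)
)

# split() returns a list named by phase; map_dbl() returns a named numeric vector
p_values <- map_dbl(
  split(df, df$phase),
  function(d) kruskal.test(val ~ distance, data = d)$p.value
)
p_values
```

The names on p_values ("a", "b", "c") make it easy to see which p-value belongs to which group, which an unnamed list from group_split() would not show.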

**maycca** · Answer 2 · 2018-05-04T21:01:37+00:00

This answer, based on @user295691's answer to the question below, may help:

Kruskal-Wallis test: create lapply function to subset data.frame?

Here you identify the groups within which you want to run the test, and use split() to subset your data frame correctly.

Dummy example:

# create data
val <- runif(60, min = 0, max = 100)
distance <- floor(runif(60, min = 1, max = 3))  # takes the values 1 or 2
phase <- rep(c("a", "b", "c"), 20)

df <- data.frame(val, distance, phase)

# get unique groups
ii <- unique(df$phase)

# run the Kruskal-Wallis test on a single group via the subset argument
kruskal.test(df$val ~ df$distance,
             subset = df$phase == "c")
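As a quick check on what subset does, the sketch below compares it with filtering the data frame by hand. It assumes a deterministic stand-in for the dummy data (distance simply alternates between 1 and 2, and the seed is arbitrary); the names via_subset and via_filter are only illustrative.

```r
# Stand-in data with the same columns as the dummy example
set.seed(1)  # arbitrary seed, only for reproducibility
df <- data.frame(
  val      = runif(60, min = 0, max = 100),
  distance = rep(1:2, times = 30),
  phase    = rep(c("a", "b", "c"), 20)
)

# With data = df, the subset expression is evaluated inside the data frame,
# so the column can be referred to by its bare name
via_subset <- kruskal.test(val ~ distance, data = df, subset = phase == "c")

# The same rows selected manually before the test
via_filter <- kruskal.test(val ~ distance, data = df[df$phase == "c", ])

# Both calls use exactly the same observations, so the results agree
all.equal(via_subset$p.value, via_filter$p.value)
```

Note that the comparison inside subset uses ==; a single = would be read as an argument assignment rather than a test for equality.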

And now apply the kruskal.test to each group using split:

lapply(split(df, df$phase), function(d) { kruskal.test(val ~ distance, data=d) })

or loop over the group names and pass each one to subset:

lapply(ii, function(i) { kruskal.test(df$val ~ df$distance, subset = df$phase == i) })

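Both approaches produce a test result for each group (exact values will vary because the data are random, and the split() version labels the data as "val by distance"):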

[[1]]

    Kruskal-Wallis rank sum test

data:  df$val by df$distance
Kruskal-Wallis chi-squared = 0.14881, df = 1, p-value = 0.6997


[[2]]

    Kruskal-Wallis rank sum test

data:  df$val by df$distance
Kruskal-Wallis chi-squared = 0.11688, df = 1, p-value = 0.7324


[[3]]

    Kruskal-Wallis rank sum test

data:  df$val by df$distance
Kruskal-Wallis chi-squared = 0.0059524, df = 1, p-value = 0.9385

Or just get the p-values (notice the addition of $p.value after the kruskal.test):

lapply(ii, function(i) {
  kruskal.test(df$val ~ df$distance,
               subset = df$phase == i)$p.value
})
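Applied to the data described in the question, the same subset pattern might look like the sketch below. The real Isotope.Data is not available here, so the data frame is a made-up stand-in: only the column names and the producer group sizes (6, 9 and 3) follow the question, while the N values, the seed and the reduced set of consumer groups are purely hypothetical.

```r
# Hypothetical stand-in for Isotope.Data; the N values are random, not real
set.seed(42)  # arbitrary seed
Isotope.Data <- data.frame(
  Simple_Group = rep(
    c("Abalone", "Teleost", "Phaeophyceae", "Rhodophyta", "Seagrass"),
    times = c(24, 20, 6, 9, 3)
  ),
  N = runif(62, min = 2, max = 12)
)
producer_groups <- c("Phaeophyceae", "Rhodophyta", "Seagrass")
Isotope.Data$Trophic_Group <- ifelse(
  Isotope.Data$Simple_Group %in% producer_groups, "Producers", "Consumer"
)

# Test N across the three producer groups only.
# Note the response is N and the comparison uses ==, not =
producers_test <- kruskal.test(
  N ~ Simple_Group,
  data   = Isotope.Data,
  subset = Trophic_Group == "Producers"
)
producers_test
```

The consumer groups are dropped before the test, so it has 2 degrees of freedom for the three producer groups. On the P.S.: Dunn's test ranks all the observations it is given together and adjusts p-values for the number of pairwise comparisons, so running it on all eight groups rather than on the three producers alone would be expected to give different significance levels.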
