Calculate totals in each column and then run a Fisher's test in R

Question

64 views Asked by tacrolimus At 15 October 2020 at 20:33

Data:

variant disease control total
A1         1      53    54
A2         6      2     8
A3         15     37    52
A4         0      53    53
A5         65     4     69
A6         4      5     9
A7         3      34    37

I would like to add a row at the bottom with column totals for the disease and control columns, and then run a Fisher's exact test per row, adding another column with the p-values from the test.

Desired outcome (p-values made up):

variant disease control total p-value
A1         1      53    54    0.001
A2         6      2     8     0.6921
A3         15     37    52    1
A4         0      53    53    0.98
A5         65     4     69    0.68
A6         4      5     9     0.63
A7         3      34    37    0.832
C_total    94     188

I've tried:

rbind(df, colSums(df[,2:3]), fill=TRUE)

But this gives me all the column totals in the final two columns.

I'm not sure about the Fisher's test yet, but I imagine some form of apply function that uses each row together with the totals to create a 2x2 table.

Many thanks

There are 2 answers

r2evans On 15 October 2020 at 20:50

For the first of your questions:

rbind(df, rbind(colSums(df[,2:3])), fill = TRUE)[ (.N == seq_len(.N)), variant := "Total"][]
#    variant disease control total p-value
# 1:      A1       1      53    54  0.0010
# 2:      A2       6       2     8  0.6921
# 3:      A3      15      37    52  1.0000
# 4:      A4       0      53    53  0.9800
# 5:      A5      65       4    69  0.6800
# 6:      A6       4       5     9  0.6300
# 7:      A7       3      34    37  0.8320
# 8:   Total      94     188    NA      NA
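If df is a plain data.frame rather than a data.table, the same totals row can probably be added in base R. This is only a sketch: the data are retyped from the question, and df_tot and totals are names made up for illustration.

```r
# Data retyped from the question (assumed here to be a plain data.frame)
df <- data.frame(
  variant = paste0("A", 1:7),
  disease = c(1, 6, 15, 0, 65, 4, 3),
  control = c(53, 2, 37, 53, 4, 5, 34)
)
df$total <- df$disease + df$control

# One-row data.frame holding the column sums; `total` is left as NA,
# as in the desired output in the question
totals <- data.frame(
  variant = "C_total",
  disease = sum(df$disease),
  control = sum(df$control),
  total   = NA
)

# rbind() matches columns by name, so the totals land under disease/control
df_tot <- rbind(df, totals)
```

Building the totals as a named one-row data.frame avoids the problem in the question, where an unnamed vector of sums gets filled into the wrong columns.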

tmfmnk On 15 October 2020 at 20:52 (accepted answer)

One dplyr and tibble solution could be:

df %>%
 add_row(variant = "Total", !!!colSums(df[-1])) %>%
 rowwise() %>%
 mutate(p_value = chisq.test(c_across(c(disease, control)), p = c(0.5, 0.5))$p.value)

  variant disease control total  p_value
  <chr>     <dbl>   <dbl> <dbl>    <dbl>
1 A1            1      53    54 1.48e-12
2 A2            6       2     8 1.57e- 1
3 A3           15      37    52 2.28e- 3
4 A4            0      53    53 3.34e-13
5 A5           65       4    69 2.08e-13
6 A6            4       5     9 7.39e- 1
7 A7            3      34    37 3.46e- 7
8 Total        94     188   282 2.17e- 8

Since I assume you want to test whether the counts in the two groups are the same, a chi-square goodness-of-fit test could be used, which is what the code above does in place of Fisher's test.
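If a Fisher's exact test per row is what is actually needed, one possible sketch is below. It assumes each variant is compared against all other variants pooled (column total minus that row), which is only one reading of "per row and per total" in the question. The names tot_disease, tot_control and p_value are made up for illustration.

```r
# Data retyped from the question
df <- data.frame(
  variant = paste0("A", 1:7),
  disease = c(1, 6, 15, 0, 65, 4, 3),
  control = c(53, 2, 37, 53, 4, 5, 34)
)

# Column totals, computed before any totals row is appended
tot_disease <- sum(df$disease)
tot_control <- sum(df$control)

# For each variant, build a 2x2 table:
#   row 1 = this variant          (disease, control)
#   row 2 = all other variants    (disease, control)   <- assumption
# and keep the p-value from fisher.test()
df$p_value <- mapply(function(d, k) {
  tab <- matrix(c(d, k, tot_disease - d, tot_control - k),
                nrow = 2, byrow = TRUE)
  fisher.test(tab)$p.value
}, df$disease, df$control)
```

The totals row can be appended afterwards with p_value left as NA, so that it does not take part in the tests itself.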
