Contingency Table from Sparse Matrix

I have a large sparse matrix, and I want to build a contingency table for every pair of columns. For example, say my sparse matrix is Mat:

D1   D2  D3  D4  D5  ..  Dn
1    0   1   0   0   ..  0
0    1   1   1   1   ..  1
..   ..  ..  ..  ..  ..  ..
1    0   1   0   1   ..  1

Now I need to build a contingency table for every combination of Di and Dj, i.e. for (D1,D2), (D1,D3), (D1,D4), .., (D1,Dn), (D2,D3), (D2,D4), .., (D2,Dn), .., (Dn-1,Dn).

Structure of each contingency table:

 r1  r2
 r3  r4

# where r1 is the total number of 1's in column Di
#       r2 is the total number of 1's in columns Di AND Dj
#       r3 is the total number of 1's in columns Di AND Dj
#       r4 is the total number of 1's in column Dj

In pseudocode, the procedure is:

for each i in (1 : n-1) {
    for each j in (i+1 : n) {
        calculate r1, r2, r3, r4
        create the contingency table for Di and Dj
        apply the Fisher test to that table
    }
}

I need a fast implementation, as this currently takes more than 2-3 days to run.

Answer by Sotos (best answer)

Here is one way to get all the 2 x 2 matrices:

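# m1 is your original matrix
fun1 <- function(x, y) {
  # cells: total of column x, combined total of columns x and y (twice), total of column y
  matrix(data = c(sum(m1[, x]), sum(m1[, c(x, y)]), sum(m1[, c(x, y)]), sum(m1[, y])),
         nrow = 2, ncol = 2)
}

# all column pairs, computed once: row 1 holds i, row 2 holds j
ind <- combn(1:ncol(m1), 2)
ind1 <- ind[1, ]
ind2 <- ind[2, ]
final.list <- Map(fun1, ind1, ind2)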

head(final.list, 2)
#[[1]]
#     [,1] [,2]
#[1,]    3    6
#[2,]    6    3

#[[2]]
#     [,1] [,2]
#[1,]    3    6
#[2,]    6    3
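
The question also asks for a Fisher test on each table. A minimal sketch, assuming `final.list` was built with `fun1` as above: `fisher.test()` from base R's stats package accepts a 2 x 2 matrix of counts, so one p-value per pair can be collected with `vapply`. The example matrix is rebuilt here only to keep the snippet self-contained; `pairs` and `p.values` are illustrative names, not part of the original answer.

```r
# Rebuild the 6 x 6 example matrix from the DATA section (alternating 0/1 values).
m1 <- matrix(rep(c(0, 1), 18), nrow = 6, ncol = 6,
             dimnames = list(NULL, paste0("D", 1:6)))

# Same table layout as fun1 in the answer.
fun1 <- function(x, y) {
  matrix(c(sum(m1[, x]), sum(m1[, c(x, y)]), sum(m1[, c(x, y)]), sum(m1[, y])),
         nrow = 2, ncol = 2)
}

# 2 x choose(n, 2) matrix of column index pairs.
pairs <- combn(ncol(m1), 2)
final.list <- Map(fun1, pairs[1, ], pairs[2, ])

# One Fisher exact test per 2 x 2 table; keep only the p-values.
p.values <- vapply(final.list,
                   function(tab) fisher.test(tab)$p.value,
                   numeric(1))

# Label each p-value with the pair of columns it belongs to.
names(p.values) <- paste(colnames(m1)[pairs[1, ]],
                         colnames(m1)[pairs[2, ]], sep = "-")
```

With many columns the number of pairs grows as n(n-1)/2, so the tests themselves may dominate the run time; keeping only the p-values rather than the full test objects at least keeps memory use down.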

DATA

dput(m1)
structure(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 
1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1), .Dim = c(6L, 
6L), .Dimnames = list(NULL, c("D1", "D2", "D3", "D4", "D5", "D6"
)))

Or similarly, using column sums that are computed only once:

# compute the column sums once and reuse them for every pair
c.sums <- colSums(m1)

fun2 <- function(x, y) {
  matrix(data = c(c.sums[x], sum(c.sums[c(x, y)]), sum(c.sums[c(x, y)]), c.sums[y]),
         nrow = 2, ncol = 2)
}

# all column pairs: row 1 holds i, row 2 holds j
ind <- combn(1:ncol(m1), 2)
ind1 <- ind[1, ]
ind2 <- ind[2, ]

final.list2 <- Map(fun2, ind1, ind2)
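
If the "Di AND Dj" cells are meant as the number of rows where both columns are 1 (an assumption; the functions above instead add the two column totals), all pairwise counts can be obtained from a single matrix product rather than one `sum` per pair. A rough sketch in base R, where `m`, `co`, `pairs`, and `tables` are illustrative names and the small matrix is made-up data:

```r
# Small 0/1 example matrix (illustrative data, not from the question).
m <- matrix(c(1, 0, 1, 1,
              0, 1, 1, 0,
              1, 1, 1, 0),
            nrow = 4,
            dimnames = list(NULL, c("D1", "D2", "D3")))

# crossprod(m) is t(m) %*% m: entry [i, j] counts the rows where
# Di and Dj are both 1, and the diagonal holds the column totals.
co <- crossprod(m)

# All column index pairs (i < j), one pair per column.
pairs <- combn(ncol(m), 2)

# Build each table in the question's layout: r1 = total of Di,
# r2 = r3 = co-occurrence count (assumed reading of "AND"), r4 = total of Dj.
tables <- lapply(seq_len(ncol(pairs)), function(k) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  matrix(c(co[i, i], co[i, j], co[i, j], co[j, j]), nrow = 2)
})
names(tables) <- paste(colnames(m)[pairs[1, ]],
                       colnames(m)[pairs[2, ]], sep = "-")
```

For a sparse matrix stored with the Matrix package, its `crossprod` method should play the same role without densifying the input, though that is not tested here. A conventional 2 x 2 table for Fisher's test would hold the four disjoint counts (both, Di only, Dj only, neither); those can be derived from the same `co` matrix and the row count, but the layout above follows the one given in the question.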