I want to run chisq.test() on each level of a categorical variable.
So far, I have managed to run it on each categorical variable as a whole, using the code below.
# Random generation of values for categorical data
set.seed(12)
x <- data.frame(col1 = sample(LETTERS[1:4], 100, replace = TRUE),
                col2 = sample(LETTERS[3:6], 100, replace = TRUE),
                col3 = sample(LETTERS[2:5], 100, replace = TRUE),
                out  = sample(c(1, 2), 100, replace = TRUE))
# performing chisq.test
# one test per predictor column; naming the column gives the p.value header shown below
pval <- data.frame(p.value = sapply(1:3, function(i) chisq.test(x[, i], x[, "out"])$p.value))
#output
p.value
1 0.33019256
2 0.08523487
3 0.79403367
I am interested in comparing each level across the different outcomes.
# for col1 levels different outcomes
table(x$col1,x$out)
#output
     1  2
  A  8 12
  B 18 10
  C 12 11
  D 18 11
For example, I would like to compare level B of col1 across the outcomes 1 and 2 in out (18 vs. 10 in the table above).
How can this be extended, or done in some smarter way, to each level of every categorical variable?
# Expected output
p.value
col1.A *****
col1.B *****
col1.C *****
.
.
.
col3.E *****
Thanks for your attention.
This is how you would do it if you wanted a chi-squared test for given probabilities (with p = rep(0.5, 2)). I've broken this down to make it easier to understand:
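A minimal sketch of this approach, assuming the data frame x from the question. The helper levelP and the two-argument signature of getP are illustrative choices, not fixed names; only chisq.test(), table(), sapply(), lapply(), setNames() and unlist() from base R are used.

```r
# Same simulated data as in the question
set.seed(12)
x <- data.frame(col1 = sample(LETTERS[1:4], 100, replace = TRUE),
                col2 = sample(LETTERS[3:6], 100, replace = TRUE),
                col3 = sample(LETTERS[2:5], 100, replace = TRUE),
                out  = sample(c(1, 2), 100, replace = TRUE))

# getP (assumed signature): p-value of a chi-squared test for given
# probabilities on the outcome counts of ONE level, i.e. one row of the
# level-by-outcome table, against equal probabilities for out = 1 and out = 2
getP <- function(tab, lev) {
  chisq.test(tab[lev, ], p = rep(0.5, 2))$p.value
}

# levelP (hypothetical helper): apply getP to every level of one column
levelP <- function(col, dat) {
  tab <- table(dat[[col]], dat$out)
  sapply(rownames(tab), function(lev) getP(tab, lev))
}

# Loop over the predictor columns; unlist() on the named list builds
# names of the form "col1.A", "col1.B", ...
cols <- c("col1", "col2", "col3")
pvals <- unlist(lapply(setNames(cols, cols), levelP, dat = x))
pval <- data.frame(p.value = pvals)
pval
```

Each row of pval answers "within this level, are outcomes 1 and 2 equally frequent?". Levels with small counts may trigger the usual chisq.test() warning about the approximation.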
Alternatively, if what you actually want is A vs. not A, B vs. not B, etc., you could replace the definition of getP with:
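A sketch of the one-vs-rest variant, written so that it runs on its own. It assumes getP receives the level-by-outcome table and a level name; that signature and the helper levelP are illustrative. Only getP carries the new logic: it collapses the table to a 2 x 2 table (level vs. all other levels) and tests it for independence from out.

```r
# Same simulated data as in the question
set.seed(12)
x <- data.frame(col1 = sample(LETTERS[1:4], 100, replace = TRUE),
                col2 = sample(LETTERS[3:6], 100, replace = TRUE),
                col3 = sample(LETTERS[2:5], 100, replace = TRUE),
                out  = sample(c(1, 2), 100, replace = TRUE))

# getP (assumed signature): "lev vs. not lev" against the outcome.
# For a 2 x 2 matrix chisq.test() applies Yates' continuity correction
# by default (correct = TRUE).
getP <- function(tab, lev) {
  inLev  <- tab[lev, ]               # outcome counts for this level
  notLev <- colSums(tab) - inLev     # outcome counts for all other levels
  chisq.test(rbind(inLev, notLev))$p.value
}

# levelP (hypothetical helper): apply getP to every level of one column
levelP <- function(col, dat) {
  tab <- table(dat[[col]], dat$out)
  sapply(rownames(tab), function(lev) getP(tab, lev))
}

cols <- c("col1", "col2", "col3")
pval <- data.frame(
  p.value = unlist(lapply(setNames(cols, cols), levelP, dat = x))
)
pval
```

For a single level this should agree with a direct call such as chisq.test(x$col1 == "A", x$out), which makes for an easy spot check.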