I am building a correlation matrix and, in addition to the correlation coefficients, I need the p-values. One tool that gives me both is the correlation() function from the correlation package. However, I can't find any option in it for handling missing values, like the use argument of cor().
By default cor() uses use = "everything", which returns NA for every correlation involving a column with missing data, so I effectively lose those columns. With use = "pairwise.complete.obs", cor() uses the complete pairs of observations for each pair of variables, so I don't lose the variables with missing data; this is also what correlation() does by default. However, I would like to be able to choose other methods for handling missing data in correlation(), and I can't find a way to do it.
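To make the difference between the cor() options concrete, here is a minimal sketch in base R. The `toy` data frame is hypothetical and is not the data from this question; only the z column has a missing value.

```r
# Hypothetical toy data: only z contains a missing value (row 2)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 6, 8, 10),
  z = c(1, NA, 2, 5, 3)
)

# "everything" (the default): correlations between z and the other columns are NA
r_everything <- cor(toy, method = "pearson", use = "everything")

# "complete.obs": row 2 is dropped for ALL pairs (listwise deletion)
r_complete <- cor(toy, method = "pearson", use = "complete.obs")

# "pairwise.complete.obs": row 2 is dropped only for pairs that involve z
r_pairwise <- cor(toy, method = "pearson", use = "pairwise.complete.obs")

r_everything
r_complete
r_pairwise
```

With more than one incomplete column, "complete.obs" and "pairwise.complete.obs" generally give different values, because each pair may then be computed on a different set of rows.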
a <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
`A` = c(2.8,3.4,11.81,2.98,4.09,
4.68,4.8,2.89,1.47,2.42,1.71,4.61,5.57,1.43),
`B` = c(2.54,2.75,2.01,2.14,2.34,
1.82,6.9,1.91,1.12,1.4,1.21,0.9,1.63,0.95),
`C` = c(2.88,2.83,2.75,2.94,2.93,
1.98,7.58,2.1,1.1,1.59,1.23,1.68,4.11,1.14),
`D` = c(2.61,2.72,2.46,2.77,2.74,
1.77,7.23,2.21,1.58,2.75,2.05,0.44,2.42,1.09),
`E` = c(3.83,5.33,4.44,3.91,4.84,
5.56,4.14,3.9,2.72,4.5,3.07,2.21,2.91,2.96),
`F` = c(2.81,4.21,2.81,2.87,3.47,
3.44,3.92,2.86,2.25,3.06,1.98,1.35,1.87,1.86),
`G` = c(2.32,3.6,1.99,2.23,2.72,
2.58,3.12,2.28,1.81,2.28,1.36,0.86,1.67,1.44),
`H` = c(1.79,3.31,1.71,1.97,2.08,
2.18,2.89,2.64,1.29,1.85,1,0.72,1.13,0.99),
`I` = c(1.5,2.07,1.05,1.23,1.91,
2.01,2.15,1.43,1.43,1.08,0.73,0.34,0.66,0.52),
`J` = c(1.64,2.02,1.13,1.57,1.6,
1.51,2.26,2.15,1.23,1.14,0.75,0.58,0.85,0.5),
`K` = c(2.92,NA,3.2,2.45,3.43,2.24,
5.59,2.67,1.57,2.39,1.94,1.74,2.76,1.92),
`L` = c(19.72,16.98,27.77,17.71,
23.19,28.24,15.43,15.6,16.96,23.1,19.6,15.74,20.23,
24.62),
`M` = c(27.51,13.58,13.76,21.61,
19.51,16.99,NA,11.19,11.39,25.82,9.85,10.83,5.83,
5.42),
`N` = c(7.74,4.74,3.71,5.74,6.06,
4.8,NA,3.84,3.41,6.14,2.56,3.12,1.86,2.06),
`O` = c(1.63,1.66,1.09,1.49,1.66,
0.78,3.13,1.76,0.81,1.34,0.89,0.45,0.51,0.5),
`P` = c(1.56,NA,0.9,1.48,1.6,0.83,
2.59,1.59,0.88,1.3,0.66,0.41,0.5,0.47),
Parámetro = c("CO2","CO2","CO2",
"CO2","CO2","CO2","CO2","CO2",
"CO2","CO2","CO2","CO2","CO2",
"CO2")
)
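library(correlation) # correlation()
library(dplyr)       # %>% and select()
library(tidyr)       # pivot_wider()
# correlation(): coefficients reshaped into a wide matrix
as.data.frame(correlation(a[, 1:16], method = "pearson")) %>% select(1, 2, 3) %>% pivot_wider(names_from = 1, values_from = 3)
# cor(): each pair of columns uses its own complete observations
cor(a[, 1:16], method = "pearson", use = "pairwise.complete.obs")
# cor(): default behaviour, NA for any pair involving a column with missing data
cor(a[, 1:16], method = "pearson", use = "everything")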
What I want to know is how to choose the way missing values are treated by the correlation() function.
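For reference, a possible workaround that does not rely on any option inside correlation() is to filter the rows before computing the correlations. This is only a sketch under assumptions: `cor_listwise()` is a hypothetical helper, not part of any package, and it uses base R's cor.test() to emulate listwise deletion (the equivalent of use = "complete.obs") while still returning p-values.

```r
# Hypothetical helper: r and p-value for every pair of columns, computed
# after dropping every row with at least one NA (listwise deletion).
cor_listwise <- function(df, method = "pearson") {
  complete <- na.omit(df)                    # keep only fully observed rows
  pairs <- t(combn(names(complete), 2))      # one row per pair of column names
  res <- apply(pairs, 1, function(p) {
    ct <- cor.test(complete[[p[1]]], complete[[p[2]]], method = method)
    c(r = unname(ct$estimate), p = ct$p.value)
  })
  data.frame(
    var1 = pairs[, 1],
    var2 = pairs[, 2],
    r = res["r", ],
    p = res["p", ]
  )
}

# Hypothetical toy data, not the data frame from this question
toy2 <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2, 1, 4, 3, 6, 5),
  z = c(1, NA, 2, 5, 3, 4)
)
out <- cor_listwise(toy2)
out
```

Assuming the numeric columns are selected as in the question, the same idea can be tried with the package itself by passing na.omit(a[, 1:16]) to correlation(). The p-values returned here are unadjusted, so a multiple-comparison correction would have to be applied separately, for example with p.adjust().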