I have a data frame with multiple columns containing information on one diagnosis. The entries are TRUE, FALSE or NA. I create a vector that summarizes those columns as follows: if a patient was diagnosed at any time (at least one TRUE), the result is TRUE; if the only valid entries are FALSE, the result is FALSE; and if there are only missing values, the result is NA. In code:
data.frame(a        = c(FALSE, TRUE, NA, FALSE, TRUE, NA, FALSE, TRUE, NA),
           b        = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, NA, NA, NA),
           expected = c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, NA))
I need to go through all the columns rowwise, and I do so using split. Unfortunately, my data is big and this takes a long time. What I do at the moment is:
library(magrittr)
# big example data
df <- expand.grid(c(FALSE, TRUE, NA), c(FALSE, TRUE, NA)) %>%
  .[rep(1:nrow(.), 50000), ] %>%
  as.data.frame() %>%
  setNames(., nm= c("a", "b"))
# My approach
df$res <- df %>%
  split(., 1:nrow(.)) %>%
  lapply(., function(row_i){
    ifelse(all(is.na(row_i)), NA,
           ifelse(any(row_i, na.rm= TRUE), TRUE,
                  ifelse(any(!row_i, na.rm= TRUE), FALSE,
                         row_i)))
  }) %>%
  unlist()
Is there a more efficient way to solve this task?

A vectorized solution is possible with pmax().
You can also merge all the parameters into a list to avoid the anonymous function in do.call().
I rewrote it as a function, rowAnys, to complement rowSums/rowMeans in base.
You could also use pmin to implement a rowwise all().
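
As a minimal sketch of the idea: pmax() and pmin() compare their arguments element by element, so handing them every column at once through do.call() gives a rowwise any/all without looping over rows. The function bodies, the helper names .row_extreme and rowAlls, and the use of the question's small example frame are assumptions for illustration, not the original code. The result is wrapped in as.logical() in case pmax()/pmin() return a numeric type for logical input.

```r
# Small example frame from the question, with the expected summary vector.
small <- data.frame(
  a = c(FALSE, TRUE, NA, FALSE, TRUE, NA, FALSE, TRUE, NA),
  b = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, NA, NA, NA)
)
expected <- c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, NA)

# Hypothetical helper: put all columns plus na.rm into one argument list,
# so do.call() needs no anonymous function. Column names are dropped so
# they cannot clash with the na.rm argument.
.row_extreme <- function(f, x, na.rm) {
  cols <- unname(as.list(as.data.frame(x)))
  as.logical(do.call(f, c(cols, list(na.rm = na.rm))))
}

# Rowwise any(): the parallel maximum is TRUE as soon as one column is TRUE.
rowAnys <- function(x, na.rm = FALSE) .row_extreme(pmax, x, na.rm)

# Rowwise all(): the parallel minimum is TRUE only if every column is TRUE.
rowAlls <- function(x, na.rm = FALSE) .row_extreme(pmin, x, na.rm)

# na.rm = TRUE ignores NAs within a row; a row of only NAs stays NA.
small$res <- rowAnys(small[c("a", "b")], na.rm = TRUE)
```

With na.rm = TRUE this matches the rule in the question (TRUE if any TRUE, FALSE if only FALSE among the valid entries, NA if everything is missing). On the big df it would be called the same way, e.g. df$res <- rowAnys(df[c("a", "b")], na.rm = TRUE).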