I have a data frame that has NAs in every row. Some are on the left, some in the middle, and some on the right. Something like this:

a <- c(NA, NA, 1, NA)
b <- c(NA, 1,  1, NA)
c <- c(NA, NA, 1, 1)
d <- c(1, 1, NA, 1)
df <- data.frame(a, b, c, d)
df
# a  b  c  d
# NA NA NA 1
# NA 1  NA 1
# 1  1  1  NA
# NA NA 1  1

I would like to replace every NA that comes after the first 1 in its row (in the middle or on the right) with 0, but keep the leading NAs, those to the left of the first 1, as NA. My data frame is large, so I am looking for an efficient way to get this result:

# a  b  c  d
# NA NA NA 1
# NA 1  0  1
# 1  1  1  0
# NA NA 1  1

1 Answer

akrun On Best Solutions

We can use apply to loop over the rows and find the index of the first occurrence of 1 in each row, then replace the NAs from that position to the end of the row with 0.

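df[] <- t(apply(df, 1, function(x) {
    i1 <- which(x == 1)[1]        # position of the first 1 in this row
    if (is.na(i1)) return(x)      # no 1 in this row: leave it unchanged
    i2 <- i1:length(x)            # from the first 1 to the end of the row
    x[i2][is.na(x[i2])] <- 0      # replace the NAs in that range with 0
    x
}))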
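To see what the function does to one row, here is a minimal walk-through on the second row of the example, run outside of apply. The vector names x and y are only for illustration.

```r
# Second row of the example data frame
x <- c(NA, 1, NA, 1)

i1 <- which(x == 1)[1]       # which() ignores NA, so this is 2
i2 <- i1:length(x)           # 2 3 4
x[i2][is.na(x[i2])] <- 0     # only the NA at position 3 is replaced
x                            # NA 1 0 1

# A row that contains no 1 at all
y <- c(NA, NA, NA, NA)
which(y == 1)[1]             # NA, because which() returns an empty vector
```

A row with no 1 at all is worth handling explicitly: which(y == 1)[1] is NA there, and NA:length(y) would raise an error.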

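Another option is to build the mask with cumsum. The running count of 1s is at least 1 from the first 1 onward, so any NA at such a position is replaced with 0:

df[] <- t(apply(df, 1, function(x)
    replace(x, cumsum(x == 1 & !is.na(x)) >= 1 & is.na(x), 0)))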
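Since the question mentions a large data frame, here is a sketch of a column-wise variant that avoids calling a function once per row. It is not part of the original answer, and the names is_one, seen, and res are made up for this example. The idea is the same "has a 1 appeared yet in this row" mask, but accumulated across columns with Reduce, so each step works on whole columns at once.

```r
df <- data.frame(a = c(NA, NA, 1, NA),
                 b = c(NA, 1, 1, NA),
                 c = c(NA, NA, 1, 1),
                 d = c(1, 1, NA, 1))

# One logical vector per column: TRUE where the value is a (non-NA) 1
is_one <- lapply(df, function(col) !is.na(col) & col == 1)

# Running OR from left to right: TRUE once a 1 has been seen
# in the current column or any column before it
seen <- Reduce(`|`, is_one, accumulate = TRUE)

# Replace an NA with 0 only where a 1 has already been seen in that row
res <- df
res[] <- Map(function(col, s) replace(col, is.na(col) & s, 0), df, seen)
res
```

This loops over columns rather than rows, so it may be faster when there are many more rows than columns, though that has not been benchmarked here. Rows without any 1 are simply left unchanged.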