I need to delete the columns (from the second onwards) that have a non-zero value in any of the rows where the first column has specific values (e.g., sp.3 and sp.5). My dataset is large, but here is a small sample of the data.

SP   id2324 id8283  id3912  id3912  id1231...
sp.1    0   2   4   1   0
sp.2    12  10  2   3   15
sp.3    0   0   23  0   4
sp.4    2   2   11  19  0
sp.5    0   0   0   0   3
sp.6    3   1   7   3   0
sp.7    0   14  1   0   12
sp.8    1   0   2   6   6

In this small example I would expect the first id3912 column and the id1231 column to disappear.

1 Answer

Ronak Shah On Best Solutions

We can first select the rows where SP is c("sp.3", "sp.5"), then keep only the columns whose values in those rows are all 0 (that is, drop every column with at least one non-zero value there).

cbind(df[1], df[-1][colSums(df[df$SP %in% c("sp.3", "sp.5"), -1] != 0) == 0])


#    SP id2324 id8283 id3912.1
#1 sp.1      0      2        1
#2 sp.2     12     10        3
#3 sp.3      0      0        0
#4 sp.4      2      2       19
#5 sp.5      0      0        0
#6 sp.6      3      1        3
#7 sp.7      0     14        0
#8 sp.8      1      0        6
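
To try the one-liner above, the sample from the question can be rebuilt as a data frame. This is only a sketch: it assumes the data is read with read.table and the default check.names = TRUE, which is what would rename the second id3912 column to id3912.1. The names target, keep and result are introduced here for illustration.

```r
# Sample data from the question. With the default check.names = TRUE,
# the duplicated header id3912 becomes id3912.1.
df <- read.table(text = "SP id2324 id8283 id3912 id3912 id1231
sp.1 0 2 4 1 0
sp.2 12 10 2 3 15
sp.3 0 0 23 0 4
sp.4 2 2 11 19 0
sp.5 0 0 0 0 3
sp.6 3 1 7 3 0
sp.7 0 14 1 0 12
sp.8 1 0 2 6 6", header = TRUE)

# Rows of interest (illustrative variable name)
target <- c("sp.3", "sp.5")

# TRUE for data columns that are 0 in every target row
keep <- colSums(df[df$SP %in% target, -1] != 0) == 0

# First column plus the data columns that passed the check
result <- cbind(df[1], df[-1][keep])
names(result)
```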

Breaking it down step-by-step

Select rows where SP is c("sp.3", "sp.5")

df[df$SP %in% c("sp.3", "sp.5"), -1]
#  id2324 id8283 id3912 id3912.1 id1231
#3      0      0     23        0      4
#5      0      0      0        0      3

Find cells where value is not equal to 0

df[df$SP %in% c("sp.3", "sp.5"), -1] != 0
#  id2324 id8283 id3912 id3912.1 id1231
#3  FALSE  FALSE   TRUE    FALSE   TRUE
#5  FALSE  FALSE  FALSE    FALSE   TRUE

Find columns where all values are 0

colSums(df[df$SP %in% c("sp.3", "sp.5"), -1] != 0) == 0

#  id2324   id8283   id3912 id3912.1   id1231 
#    TRUE     TRUE    FALSE     TRUE    FALSE 
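
The reason colSums works here is that logical values are treated as numbers when summed (TRUE as 1, FALSE as 0), so the column sum counts the non-zero cells. A tiny illustration on a made-up logical matrix (m and counts are hypothetical names, not part of the original data):

```r
# Made-up logical matrix: column a has no TRUE, b has one, c has two
m <- matrix(c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE), nrow = 2,
            dimnames = list(NULL, c("a", "b", "c")))

# Summing logicals counts the TRUE cells in each column
counts <- colSums(m)

# A count of 0 means the column had no TRUE (no non-zero cell)
counts == 0
```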

We then select the columns that are TRUE and cbind them with the first column.
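
If this is needed in more than one place, the steps could be wrapped in a small function. The sketch below is not part of the original answer: drop_nonzero_cols, keys, key_col and the demo data are hypothetical names. It adds drop = FALSE so the row subset stays a data frame even when only one data column is left, since colSums would otherwise fail on a plain vector.

```r
# Hypothetical helper: drop every data column that has a non-zero value
# in any row whose key (column key_col, a numeric index) is in keys.
drop_nonzero_cols <- function(data, keys, key_col = 1) {
  rows <- data[[key_col]] %in% keys
  # drop = FALSE keeps a data frame even with a single data column
  values <- data[rows, -key_col, drop = FALSE]
  keep <- colSums(values != 0) == 0
  cbind(data[key_col], data[-key_col][keep])
}

# Made-up example: y is non-zero in row "a", so it is removed
demo <- data.frame(SP = c("a", "b", "c"), x = c(0, 5, 0), y = c(1, 0, 0))
out <- drop_nonzero_cols(demo, keys = c("a", "c"))
names(out)
```

On the question's data this would be called as drop_nonzero_cols(df, c("sp.3", "sp.5")).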