Create new column depending on multiple other column character strings in R

I'm working on food consumption among mother-infant dyads and my data shows who is in proximity to my individual of interest when the eating behaviour is recorded. The data structure looks like this (very simplified):

Individual  Food Consumed  In.Contact  In.2m  In.5m  DyadID
Ap          A                          Aap    Re     1
Ap          B                                        1
Ap          A                          Aap    Re     1
Re          C              Red         Aap           2
Aap         A              Ap          Red           1
Red         C              Re          Aap           2
Red         A                          Aap    Ap     2

Here, Ap-Aap and Re-Red are two dyads (infant-mother). Each dyad has a DyadID number that links the two individuals. I want R to detect whether Ap or Aap (and likewise Re or Red for the second dyad) is in proximity to its dyad partner when it eats, and to record this in a new binary column where 1 = in proximity (the partner appears in 'In.Contact', 'In.2m' or 'In.5m') and 0 = not in proximity:

Individual  Food Consumed  In.Contact  In.2m  In.5m  DyadID  Dyad.Proximity
Ap          A                          Aap    Re     1       1
Ap          B                                        1       0
Ap          A                          Aap    Re     1       1
Re          C              Red         Aap           2       1
Aap         A              Ap          Red           1       1
Red         C              Re          Aap           2       1
Red         A                          Aap    Ap     2       0

My real data has many more proximity distance columns, so I need an approach that does not require naming each column every time. I also have 12 dyads (compared to the 2 in this example), and the only methods I have found (all unsuccessful so far) would require repeating everything for each dyad.

So far, I have tried using the 'mutate' function:

data1 <- data %>% 
    mutate(Dyad.Proximity = ifelse(Individual == "Ap" & 
                          find(c_across(In.Contact:In.5m) = "Aap"),
                       "1", "0"))

I've also found this alternative:

data1 <- data %>% mutate(Dyad.Proximity = c("0", "1")[(find(across(In.Contact:In.5m)) == "Aap" &
                                 Individual == "Ap")])

There is a syntax error in the first one, and the second one gives me this error message:

'Error in across(): ! Must be used inside dplyr verbs.'

As I was saying, these methods (once I figure out what is wrong with my syntax) are problematic because they do not let me handle every dyad at the same time, so I would need to repeat the operation for each of my 24 individuals.

I feel like there should be an easy way to do this, but I simply cannot find it. Can anyone please help me?

Thank you!

Answer by langtang:

If you set up a small dyad dictionary, like this:

dyad_dict = list(c("Ap","Aap"), c("Re", "Red"))

then you can use data.table like this:

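library(data.table)
setDT(df)  # convert the data frame to a data.table by reference

# Returns 1 if the dyad partner of `ind` appears among the proximity values `pcols`, else 0
f <- function(ind, pcols, dyad, dyad_dict) {
  1 * any(setdiff(dyad_dict[[dyad]], ind) %in% pcols)
}
df[, Dyad.Proximity := f(Individual, c(In.Contact, In.2m, In.5m), DyadID, dyad_dict), by = 1:nrow(df)]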

Output:

   Individual Food Consumed In.Contact  In.2m  In.5m DyadID Dyad.Proximity
       <char>        <char>     <char> <char> <char>  <int>          <num>
1:         Ap             A               Aap     Re      1              1
2:         Ap             B                               1              0
3:         Ap             A               Aap     Re      1              1
4:         Re             C        Red    Aap             2              1
5:        Aap             A         Ap    Red             1              1
6:        Red             C         Re    Aap             2              1
7:        Red             A               Aap     Ap      2              0
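
The question mentions many proximity columns and 12 dyads, so typing the dictionary by hand and listing each column may not scale. As a minimal base R sketch (not part of the original answer), both could be derived from the data. It assumes that every proximity column name starts with "In.", that every dyad member appears at least once in Individual, and that empty cells are empty strings. The toy data frame (without the Food Consumed column) and the names prox_cols and partner are illustrative only:

```r
# Toy data mirroring the question's example (empty string = nobody recorded)
df <- data.frame(
  Individual = c("Ap", "Ap", "Ap", "Re", "Aap", "Red", "Red"),
  In.Contact = c("", "", "", "Red", "Ap", "Re", ""),
  In.2m      = c("Aap", "", "Aap", "Aap", "Red", "Aap", "Aap"),
  In.5m      = c("Re", "", "Re", "", "", "", "Ap"),
  DyadID     = c(1, 1, 1, 2, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Derive the dictionary from the data: distinct individuals per DyadID.
# Assumption: every dyad member appears at least once in `Individual`.
dyad_dict <- lapply(split(df$Individual, df$DyadID), unique)

# Select proximity columns by name pattern instead of listing them.
# Assumption: they all start with "In."; adjust the pattern to the real data.
prox_cols <- grep("^In\\.", names(df), value = TRUE)

# For each row: does any dyad member other than the focal individual
# appear in any of the proximity columns?
df$Dyad.Proximity <- vapply(seq_len(nrow(df)), function(i) {
  # The list is named by DyadID, so look the dyad up by name, not position
  partner <- setdiff(dyad_dict[[as.character(df$DyadID[i])]], df$Individual[i])
  as.integer(any(unlist(df[i, prox_cols]) %in% partner))
}, integer(1))

print(df$Dyad.Proximity)
```

Looking the dyad up by name (as.character(DyadID)) rather than by position should keep this working even if the DyadID values are not consecutive integers starting at 1.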