Reshape binary dataframe based on colnames in R

72 views Asked by At

I have a dataframe of camera-trap activity periods (binary data) that I need to reshape. Below is an example of the dataframe:

df <- data.frame(
           ID = c("A1","A2","B1","B2","C1","C2","D1","D2","E1","E2"),
            a = c(1,1,1,1,1,0,1,1,NA,1),
            b = c(1,0,1,1,1,0,1,1,1,1),
            c = c(1,0,1,1,1,1,1,1,1,1),
            d = c(1,0,1,1,1,0,1,1,1,1),
            e = c(1,1,1,1,1,0,1,1,1,1)
            )

colnames(df)[2:6] <- c("01/11/2016", "02/11/2016", "03/11/2016", 
                       "04/11/2016", "05/11/2016")

> df
   ID 01/11/2016 02/11/2016 03/11/2016 04/11/2016 05/11/2016
1  A1          1          1          1          1          1
2  A2          1          0          0          0          1
3  B1          1          1          1          1          1
4  B2          1          1          1          1          1
5  C1          1          1          1          1          1
6  C2          0          0          1          0          0
7  D1          1          1          1          1          1
8  D2          1          1          1          1          1
9  E1         NA          1          1          1          1
10 E2          1          1          1          1          1

The ID column denotes each camera. The date columns denote the days when the camera-trapping survey was conducted. Row values of 1 indicate that that particular camera, on that particular day, was active (i.e., working). Row values of 0 indicate that that particular camera, on that particular day, was inactive (i.e, not working). NAs indicate that that particular camera, on that particular day, was not yet set up or already retrieved (NAs can also occur at the end of the dataframe; not shown in this example).

I need to reshape the dataframe to be the following:

ID  Setup_date  Retrieval_date  Problem1_from   Problem1_to Problem2_from   Problem2_to
A1  01/11/2016  05/11/2016      NA              NA          NA              NA
A2  01/11/2016  05/11/2016      02/11/2016      04/11/2016  NA              NA
B1  01/11/2016  05/11/2016      NA              NA          NA              NA
B2  01/11/2016  05/11/2016      NA              NA          NA              NA
C1  01/11/2016  05/11/2016      NA              NA          NA              NA
C2  01/11/2016  05/11/2016      01/11/2016      02/11/2016  04/11/2016      05/11/2016
D1  01/11/2016  05/11/2016      NA              NA          NA              NA
D2  01/11/2016  05/11/2016      NA              NA          NA              NA
E1  02/11/2016  05/11/2016      NA              NA          NA              NA
E2  01/11/2016  05/11/2016      NA              NA          NA              NA

So that each row still represents a single camera (with the column name "ID"), together with additional columns called:

"Setup_date" – denoting the date when that particular camera was first set up.

"Retrieval_date" – denoting the date when that particular camera was retrieved.

"Problem1_from" – denoting the start date when that particular camera was coded 0.

"Problem1_to" – denoting the end date when that particular camera was coded 0.

Cameras can be coded 0 during multiple occasions, as demonstrated for camera C2. This would then result another "ProblemX_from" and "_to" as in the example above. Also note that camera E1 "Setup_date" is a day later than the rest, since this camera was coded NA on the 01/11/2016 (i.e., it was only set up on the 02/11/2016).

I'm unsure how to proceed with this task?

Appreciate any assistance. Thanks in advance.

0

There are 0 answers