How do I convert a dataset with 1 column per day to only 1 date column (days) with R

206 views Asked by At

I have a dataset with birds observations. I have one column for each day of the month (31). If the bird is seen this day, it has the "place" info (where it has been seen). Can someone help we with a code that can merge these 31 columns (days) into one date column and another second column with the "place" information? I think I can use the "dcast" function from "reshape2" package but I don't know how to use it to keep the two informations (day and place).

Here is the structure of my dataset:

    bird_data = structure(
  list(
    ID = c(
      "FB37461",
      "FA42342",
      "FA42261",
      "FB37329",
      "FA42332",
      "FH60963",
      "FB37473",
      "FB37593",
      "FA85545",
      "FC10619"
    ),
    Name = c(
      "Dekort",
      "Simon",
      "Devil",
      "Chimere",
      "Private Norman",
      "Aurresku",
      "Rombus",
      "Tan?",
      "Taiwan",
      "Bakugo"
    ),
    Lring = c("",
              "ID/RD", "", "DB", "", "YL/WT", "", "", "", "ID/DG"),
    Rring = c("",
              "DB", "", "MV/ID", "", "ID", "", "", "", "O"),
    sex = c("M", "F",
            "F", "U", "F", "F", "U", "J", "F", "J"),
    month = c(
      "October",
      "December",
      "July",
      "April",
      "November",
      "November",
      "March",
      "April",
      "August",
      "March"
    ),
    year = c(
      2016L,
      2018L,
      2015L,
      2018L,
      2016L,
      2018L,
      2015L,
      2015L,
      2016L,
      2018L
    ),
    seen = c(1L, 0L, 1L,
             1L, 1L, 0L, 0L, 1L, 1L, 1L),
    Freq = c(13L, 0L, 9L, 10L, 4L, 0L,
             0L, 7L, 5L, 14L),
    `1` = c("S", "", "A", "", "", "", "", "", "",
            "AS"),
    `2` = c("", "", "", "", "", "", "", "", "", "A"),
    `3` = c("",
            "", "", "A", "", "", "", "", "", ""),
    `4` = c("S", "", "", "A",
            "S", "", "", "", "", ""),
    `5` = c("S", "", "", "A", "", "", "",
            "", "", ""),
    `6` = c("", "", "", "", "", "", "", "", "", "A"),
    `7` = c("S", "", "", "", "", "", "", "A", "", "A"),
    `8` = c("",
            "", "A", "", "", "", "", "A", "", "S"),
    `9` = c("", "", "",
            "", "", "", "", "", "", "A"),
    `10` = c("", "", "A", "A",
             "", "", "", "A", "", ""),
    `11` = c("", "", "", "", "", "",
             "", "A", "", ""),
    `12` = c("A", "", "", "A", "", "", "",
             "", "", ""),
    `13` = c("S", "", "", "AS", "", "", "", "",
             "", "A"),
    `14` = c("", "", "AF", "", "", "", "", "A", "",
             "S"),
    `15` = c("", "", "A", "", "", "", "", "", "", ""),
    `16` = c("", "", "A", "", "S", "", "", "A", "", "S"),
    `17` = c("",
             "", "A", "A", "", "", "", "A", "F", ""),
    `18` = c("AS", "",
             "A", "", "S", "", "", "", "", ""),
    `19` = c("", "", "", "",
             "", "", "", "", "", ""),
    `20` = c("S", "", "", "A", "", "",
             "", "", "", "S"),
    `21` = c("S", "", "", "", "", "", "", "",
             "", ""),
    `22` = c("", "", "", "", "", "", "", "", "", "S"),
    `23` = c("", "", "", "", "", "", "", "", "A", ""),
    `24` = c("",
             "", "", "", "S", "", "", "", "", ""),
    `25` = c("S", "", "",
             "", "", "", "", "", "S", ""),
    `26` = c("S", "", "", "A",
             "", "", "", "", "F", ""),
    `27` = c("F", "", "", "A", "",
             "", "", "", "", "S"),
    `28` = c("S", "", "", "", "", "", "",
             "", "", ""),
    `29` = c("", "", "A", "", "", "", "", "", "",
             "S"),
    `30` = c("", "", "", "", "", "", "", "", "AF", "S"),
    `31` = c("", "", "", "", "", "", "", "", "", "")
  ),
  row.names = c(
    11419L,
    21637L,
    7186L,
    17878L,
    11678L,
    21385L,
    6290L,
    6640L,
    10785L,
    17740L
  ),
  class = "data.frame"
)

For example at row 9, we have "Taiwan" with 5 observations in August 2016. She has been seen on the 17th, 23rd, 25th, 26th, 30th. So I want to do 5 lines (5 observations) with one column for date and another one for place seen (F, A, S).

1

There are 1 answers

0
Steven On BEST ANSWER

Easiest way is simply to select the columns you want to pivot and put everything else into some other format.

bird_data %>%
  pivot_longer(cols = paste(1:31),
               names_to = "day", 
               values_to = 'location') %>% 
  filter(location != "", 
         Name == "Taiwan")

I use the paste() fx to force the integers into characters. The last filter step is just to 1. get rid of the null/empty rows and 2. display the rows for Taiwan.

Once you've pivoted it longer, you can figure out what you want to do with the excess rows for Freq and other cols. It's easy to drop other columns in the cols argument for pivot_longer().