I am working with a teacher absences file. I need to flag any instance of 5+ consecutive absences.
I have a dataframe like the one below. How do I split this data into new dataframes by teacher name?
absences <- data.frame(
staffid = c("123","456","789","101","121", "123", "123", "123", "123"),
name = c("Kara","Barbie","Sam","Jane","Chris", "Kara", "Kara", "Kara", "Kara"),
date = c(as.Date("2022-08-31"), as.Date("2022-09-01"), as.Date("2022-09-01"),
as.Date("2022-09-02"), as.Date("2022-09-07"), as.Date("2022-09-01"),
as.Date("2022-09-02"), as.Date("2022-09-06"), as.Date("2022-09-07")),
schoolday = c(1, 2, 2, 3, 5, 2, 3, 4, 5))
I tried the code below:
absences_new <- absences %>% nest(.by = name)
which gave me
Sorry, I am new to R and do not know what the mini-data frames in the output are called.
I also tried:
X <- split(absences, absences$name)
which gave me a result more like what I'm looking for, but it is in a format I don't know how to work with.
I also tried:
teachername <- (unique(absences$name))
splitdata <- split(absences, teachername)
but this gave me an error of "data length is not a multiple of split variable."
What I want for my output is something like what this would make:
Kara <- data.frame(
staffid = c("123","123", "123", "123", "123"),
name= c("Kara", "Kara", "Kara", "Kara", "Kara"),
date = c(as.Date("2022-08-31"), as.Date("2022-09-01"),
as.Date("2022-09-02"), as.Date("2022-09-06"),
as.Date("2022-09-07")),
schoolday = c(1, 2, 3, 4, 5))
Sam <- data.frame(
staffid = c("789"),
name= c("Sam"),
date = c(as.Date("2022-09-01")),
schoolday = c(2))
Then, my plan is to take these mini data frames and scan for any consecutive days.
Thank you!!