In a dataset of 200,000/400,000 rows, with recorded variables for several id's for each minute over a period of 6 months, I want to extract data occurring within a certain time of day, say 7 am to 17 pm. I generally work in Tidyverse.
Lubridate intervals seems to be the function to use, but how do I get this to work on recurring time-spans across dates?
Example data:
id<-c("A","A","A","B","B","B")
datetime<-ymd_hms("2023-05-24 06:59:00","2023-05-24 07:00:00","2023-05-24 07:01:00","2023-05-24 06:59:00","2023-05-24 07:00:00","2023-05-24 07:01:00")
measurement<-c(4,5,7,9,2,6)
df<-data.frame(id,datetime,measurement)
I can add a column with just the time
mutate(time=format(datetime,format="%H:%M"))
But creating an interval with only time, not date, does not seem to work
valid_time<-interval(hm('07:00'),hm('17:00'))
My aim was to then create a column with TRUE/FALSE of whether time point is within interval, and then filter on that.
mutate(valid=time %within% valid_time)
Ideally, I could just make a one-step filtering on datetime, something like the below (not working)
filter(time %in% 0700:1700)
or even better (also not working)
filter(hm(datetime) %in% 0700:1700)
Much thanks in advance!
Here is a
data.tableapproach usingas.ITime(), which represents the number of seconds in the day.