I recently came across dplyr and - as a newbie - like it very much. Hence, I try to convert some of my base-R code into dplyr-code.
Working with air traffic control data, I am struggling with coercing timestamps using lubridate and as.POSIXlt to parse timestamps embedded in a mutate_each() call. I need the POSIXlt format as I have to work with local times (at different locations) later on. Reading in the data delivers a data frame of characters. The following is a simplistic example:
ICAO_ADEP <- c("DGAA","ZSPD","UAAA","RJTT","KJFK","WSSS")
MVT_TIME_UTC <- c("01-Jan-2013 04:02:24", NA,"01-Jan-2013 04:08:18", NA,"01-Jan-2013 04:17:11","01-Jan-2013 04:21:52")
flights <- data.frame(ICAO_ADEP, MVT_TIME_UTC)
The function I wrote reads as follows:
make_POSIXlt <- function(vec, tz="UTC"){
vec <- parse_date_time(vec, orders="dmy_hms", tz=tz)
vec <- as.POSIXlt(vec, tz=tz)
}
The code works fine when executed with a single column:
flights$MVT_TIME_UTC <- make_POSIXlt(flights$MVT_TIME_UTC)
If I run the following dplyr code the function fails:
flights$BLOCK_TIME_UTC <- mutate_each(flights, funs(make_POSIXlt(.)), MVT_TIME_UTC)
Error: wrong result size (9), expected 6 or 1
The issue should be linked with the as.POSIXlt call. If this line is commented out the code works within mutate_each and coerces the timestamp into POSIXct.
Any idea/help on what is wrong? Obviously, my data has several timestamps that I would like to coerce with mutate_each (or any other suitable dplyr function) ...
Revisiting my question about 4 years later, I realised that I forgot to mark it as answered. However, this also gives me the chance to document how this (relatively) simple type coercion can (meanwhile) elegantly solved with
dplyr
andlubridate
.Key lesson learned:
lubridate
package.For the example from above
will coerce the timestamps in MVT_TIME_UTC. Check the documentation on lubridate for other parsers and/or how to handle local time zones.