I have a dataset with a number of date columns in Excel serial date format. I've managed to convert the dates to POSIXct format using the following simple mutate:
myDataSet_wrangled <- myDataSet %>%
  mutate(startDate = as.POSIXct(as.numeric(startDate) * 3600 * 24, origin = "1899-12-30", tz = "GMT"))
However, when I try to refactor this as a function of the form convertDate(df, ...), I can't seem to wrap my head around how to correctly indirect the column names. Frustratingly, the following code works with one column name, but when I pass multiple column names, it fails with the error "Error in 'mutate()': ... ! object 'endDate' not found":
myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634), # dates in Excel format
  endDate = c(44201.131, 44270.859, 44330.023)
)
convertXlDateToPOSIXct <- function(df, ..., epoch = "1899-12-30", timezone = "GMT") {
  cols <- enquos(...)
  df <- df %>%
    mutate(across(!!!cols, ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
  return(df)
}
# Call with one column
myDataSet_wrangled <- myDataSet %>%
  convertXlDateToPOSIXct(startDate)
# startDate correctly converted, no error thrown
# Call with multiple columns
myDataSet_wrangled <- myDataSet %>%
  convertXlDateToPOSIXct(startDate,
                         endDate)
# 404: endDate Not Found
I've tried various combinations of ..., enquos, ensyms, and !!!, but I think I'm fundamentally misunderstanding how name masking works in R.
The R Documentation (topic-data-mask-programming {rlang}) makes some reference to forwarding of ... arguments not requiring special syntax, and demonstrates that you can call, e.g., group_by(...).
I hadn't been able to work out why this syntax wasn't working in the code above, but (with thanks to @lotus) I've realised that the real problem isn't that ... isn't properly enquo'd or ensym'd. It's that across() expects its column selection as a single argument, rather than the five or six or n separate arguments that are forwarded when passing ... directly. With two columns, the second one (endDate) lands in the position where across() expects the function to apply, hence the "object 'endDate' not found" error. Encapsulating ... with c() provides the column names in the expected format.
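For reference, here is a minimal sketch of the fix described above, assuming dplyr is loaded and reusing the example data from the question. The only change to the function is wrapping the dots in c() inside across(); enquos() and !!! are no longer needed.

```r
library(dplyr)

myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634), # dates in Excel format
  endDate = c(44201.131, 44270.859, 44330.023)
)

convertXlDateToPOSIXct <- function(df, ..., epoch = "1899-12-30", timezone = "GMT") {
  df %>%
    # c(...) collapses the forwarded columns into the single selection
    # that across() expects as its first argument
    mutate(across(c(...), ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
}

# Works with one column or several
oneColumn <- myDataSet %>% convertXlDateToPOSIXct(startDate)
bothColumns <- myDataSet %>% convertXlDateToPOSIXct(startDate, endDate)
```

In oneColumn only startDate is converted and endDate stays numeric; in bothColumns both columns come back as POSIXct.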
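Alternatively, keeping the original function (without the enclosing c() inside across()), calling it with
convertXlDateToPOSIXct(df, c(startDate, endDate)) would also work correctly, because the whole c(startDate, endDate) expression is then forwarded as a single argument. In that case it would make more sense to use a named parameter, e.g. convertXlDateToPOSIXct <- function(df, cols, epoch = "1899-12-30", timezone = "GMT").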
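A sketch of what the named-parameter version might look like. The signature is the one suggested above; the use of the embrace operator {{ cols }} to forward the selection into across() is my assumption about how the body would be written, not something confirmed in the thread.

```r
library(dplyr)

myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634), # dates in Excel format
  endDate = c(44201.131, 44270.859, 44330.023)
)

convertXlDateToPOSIXct <- function(df, cols, epoch = "1899-12-30", timezone = "GMT") {
  df %>%
    # {{ cols }} forwards the caller's selection expression to across() unevaluated
    mutate(across({{ cols }}, ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
}

# cols can be any tidyselect expression
converted <- myDataSet %>% convertXlDateToPOSIXct(c(startDate, endDate))
byHelper <- myDataSet %>% convertXlDateToPOSIXct(ends_with("Date"))
```

One advantage of this form is that the caller can pass selection helpers such as ends_with() or where(is.numeric) as well as bare column names.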