I am trying to convert a tibble to a parameter list for a function call. I want to create a simple file specification tibble for reading in multiple fixed-width files with varying columns. That way I only need to specify which columns are in a file using pull and select, and the file can then be loaded and parsed automatically. However, I am running into problems using the cols object to specify column formats.
For this example, let's assume I have a tibble of the format:
> (filespec <- tibble(ID = c("Title", "Date", "ATTR"), Length = c(23, 8, 6), Type = c("col_character()", "col_date()", "col_factor(levels=c(123456,654321)")))
# A tibble: 3 x 3
ID Length Type
<chr> <dbl> <chr>
1 Title 23 col_character()
2 Date 8 col_date()
3 ATTR 6 col_factor(levels=c(123456,654321)
I want to end up with a cols object of the format:
> (cols(Title = col_character(), Date = col_date(), ATTR=col_factor(levels=c(123456,654321))))
cols(
Title = col_character(),
Date = col_date(format = ""),
ATTR = col_factor(levels = c(123456, 654321), ordered = FALSE)
)
From other questions I have read, I know this can be done with do.call, but I cannot figure out how to convert the ID and Type columns to a cols object in an automated manner. Here is an example of what I tried:
> do.call(cols, select(filespec,ID, Type))
Error in switch(x, `_` = , `-` = col_skip(), `?` = col_guess(), c = col_character(), :
EXPR must be a length 1 vector
I assume the select needs to be wrapped in another function that performs the row-to-parameter mapping. How is this done?
As discussed in the comments, I fundamentally prefer Joran’s approach. In fact, whenever you find yourself storing code expressions in character strings, this should set off alarm bells: it’s an anti-pattern known as stringly typed code (a riff on, and quite the opposite of, strongly typed code). Unfortunately R is quite full of stringly typed code.
That said, your use-case (file-based configuration) is in itself a good idea. I would consider storing the information in a different format than R code fragments. But, well, it does work. So let’s explore why your code doesn’t work.
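The first problem is this: you pass a tibble to do.call. Tibbles are lists of columns, so do.call allows this. However, internally your call is transformed to something equivalent to:
cols(ID = c("Title", "Date", "ATTR"), Type = c("col_character()", "col_date()", "col_factor(levels=c(123456,654321)"))
But this isn’t the code we want at all!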
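To see this mapping without involving readr, here is a minimal sketch. It uses a base data.frame as a stand-in for the tibble (both are lists of columns) and a hypothetical helper, show_args, that simply returns the arguments it receives.

```r
# Stand-in for the question's tibble; a data.frame is also a list of columns.
filespec <- data.frame(
  ID = c("Title", "Date", "ATTR"),
  Length = c(23, 8, 6),
  Type = c("col_character()", "col_date()", "col_factor(levels=c(123456,654321)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns whatever named arguments it receives.
show_args <- function(...) list(...)

# do.call turns each column into one named argument, so this is the same as
# show_args(ID = c("Title", ...), Type = c("col_character()", ...)).
received <- do.call(show_args, filespec[c("ID", "Type")])

names(received)    # the argument names are the column names
lengths(received)  # each argument is a whole column, not a single value
```

Each argument arrives as a character vector of length three, which is consistent with the "EXPR must be a length 1 vector" error that cols reports.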
We need to fix two things here:
First, we need to use the Type column as argument values, and the ID column as argument names. We can do this by creating a new list that has ID as names and Type as values: setNames(Type, ID).
Second, cols does not know what to do with character string arguments. It needs column specifications, that is, objects of type Collector. Put differently, it’s a huge difference whether you write "col_date()" or col_date().
To fix this, we need to do something fairly complex: we need to parse the Type column as R code, and we need to evaluate the resulting parsed expressions. R provides two handy functions (parse and eval, respectively) to accomplish this. But don’t let the existence of two easy functions fool you: it’s an incredibly complex operation. R essentially needs to run a full parser and interpreter on your code fragments. And it gets even hairier if the code isn’t what you expect. For instance, the text might contain the code unlink('/', recursive = TRUE) instead of col_date(). R would then happily erase your hard drive.
This is just one of the reasons why parse/eval is complex and generally avoided. Other reasons include: what happens if there’s a parse error in the code (in fact, your code does contain a missing closing parenthesis …)?
But here we go. Now that we have all the pieces together, we can join them relatively easily:
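A minimal sketch of how these pieces might be joined, not necessarily the exact code the answer intended. It assumes the readr and tibble packages are installed, adds the missing closing parenthesis to the third Type entry, and uses the names bad, parsed, col_specs, named_specs and spec purely for illustration.

```r
library(readr)
library(tibble)

# The original third entry lacks a closing parenthesis and fails to parse.
bad <- tryCatch(
  parse(text = "col_factor(levels=c(123456,654321)"),
  error = function(e) "parse error"
)

# The question's spec, with the closing parenthesis added so that it parses.
filespec <- tibble(
  ID = c("Title", "Date", "ATTR"),
  Length = c(23, 8, 6),
  Type = c("col_character()", "col_date()",
           "col_factor(levels=c(123456,654321))")
)

# Step 1: parse each string into an unevaluated R call.
# parse() returns an expression vector; [[1]] extracts the single call.
parsed <- lapply(filespec$Type, function(code) parse(text = code)[[1]])

# Step 2: evaluate each call to obtain actual collector objects.
# Only do this with text you trust: eval() runs whatever the string says.
col_specs <- lapply(parsed, eval)

# Step 3: use ID as argument names and the collectors as argument values.
named_specs <- setNames(col_specs, filespec$ID)

# Step 4: call cols() with that named list as its arguments.
spec <- do.call(cols, named_specs)
spec
```

Printing spec should show a column specification like the one the question asks for.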
Execute this code piece by piece to see what it does and convince yourself that it’s working correctly.
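As one possible way to store the information in a format other than R code fragments, the spec could hold plain type names and look the constructor up in a fixed table, which avoids parse and eval altogether. This is only a sketch of that idea: the Levels column, the constructors table and the make_collector helper are hypothetical names introduced here, and the code assumes readr is installed.

```r
library(readr)

# Hypothetical spec format: one type name per column, plus optional factor
# levels as a comma-separated string (NA when not applicable).
filespec <- data.frame(
  ID = c("Title", "Date", "ATTR"),
  Length = c(23, 8, 6),
  Type = c("character", "date", "factor"),
  Levels = c(NA, NA, "123456,654321"),
  stringsAsFactors = FALSE
)

# Fixed table of allowed constructors; anything else is rejected.
# Note that the factor levels are character strings here, not numbers.
constructors <- list(
  character = function(levels) col_character(),
  date      = function(levels) col_date(),
  factor    = function(levels) col_factor(levels = strsplit(levels, ",")[[1]])
)

# Hypothetical helper: build one collector from a type name.
make_collector <- function(type, levels) {
  if (!type %in% names(constructors)) {
    stop("Unknown column type: ", type)
  }
  constructors[[type]](levels)
}

col_specs <- Map(make_collector, filespec$Type, filespec$Levels)
spec <- do.call(cols, setNames(col_specs, filespec$ID))
spec
```

Because only names listed in the table are accepted, a stray string such as unlink('/', recursive = TRUE) in the spec produces an error instead of being executed.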