How can I compare two character vectors with escape characters in R?

I have two lists that I'm getting from an API. I need to compare the two lists in R to determine which items are present in both lists. I had hoped to do this with the intersect() command, but it did not work. Upon further inspection, I noticed that each list was actually a single vector comprising multiple items separated by commas and escape characters. Is it possible to transform these vectors into multi-item lists so that I can compare lists? Here is some example code:

What I'd like:

 > intersect(x,y)
 [[1]]
 [1] "c"

What I'm seeing instead:

 > intersect(x,y)
 list()

 > as.character(x)
 c(\"a\", \"b\", \"c\")

 > as.character(y)
 c(\"c\", \"d\", \"e\")

What's going on here? How do I compare x and y? Is there a way to transform these vectors into lists so that I can use the intersect() command?

edit: refined example and clarified data source

There are 2 answers

Aaron left Stack Overflow

I'm still guessing here, as you haven't responded to my questions, but the only way I can see for you to get output like that is if x and y are lists whose first element is a string containing the R code you would use to create the vector you want, like this:

x <- list('c("a", "b", "c")')
y <- list('c("c", "d", "e")')
intersect(x, y)
## list()
as.character(x)
## [1] "c(\"a\", \"b\", \"c\")"
as.character(y)
## [1] "c(\"c\", \"d\", \"e\")"

If so, what you need to do is to evaluate these expressions, and then you'll have the vectors that you think you have.

xx <- eval(parse(text=x[[1]]))
yy <- eval(parse(text=y[[1]]))
xx
## [1] "a" "b" "c"
yy
## [1] "c" "d" "e"
intersect(xx, yy)
## [1] "c"

Ryan Runge suggests that "Having extra quotes like this can happen more often as data is shared between different languages or softwares. So it could be an unintended effect of how the API is being accessed." (Thanks!)
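
If evaluating text received from an API is a concern, the quoted items can also be pulled out of the string without running it as code. This is a minimal sketch, assuming each element really is a string shaped like `c("a", "b", "c")` and that the items contain no embedded quotes. `extract_items` is a hypothetical helper, not something from the original post.

```r
# Hypothetical helper: pull the quoted items out of a string such as
# 'c("a", "b", "c")' without evaluating it as R code.
extract_items <- function(s) {
  # Find every run of characters enclosed in double quotes
  quoted <- regmatches(s, gregexpr('"[^"]*"', s))[[1]]
  # Strip the surrounding quotes from each match
  gsub('"', "", quoted, fixed = TRUE)
}

# Same assumed input as in the example above
x <- list('c("a", "b", "c")')
y <- list('c("c", "d", "e")')

xx <- extract_items(x[[1]])
yy <- extract_items(y[[1]])

# Compare the extracted character vectors
common <- intersect(xx, yy)
print(common)
```

This trades generality for safety: it only handles simple quoted items, but nothing in the incoming text is ever executed.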

This doesn't, however, match the explanation you gave. More information is needed!

user3786999

Thanks for the advice, everyone. I was able to solve this problem like this:

 intersect(as.list(as.character(x[[1]])),as.list(as.character(y[[1]])))

I don't really understand why putting the term [[1]] after each list name solves the issue, but it seems to nonetheless.
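
One possible explanation for why `[[1]]` helps, assuming `x` and `y` are each a list holding a single character vector (the post does not show their actual structure): `x[[1]]` extracts the vector itself, whereas `intersect(x, y)` compares the two list elements as whole objects. Calling `as.character()` on such a list deparses the vector into one string, which would produce the escaped output shown in the question. A sketch under that assumption:

```r
# Assumed structure: each list wraps one character vector
x <- list(c("a", "b", "c"))
y <- list(c("c", "d", "e"))

length(x)       # number of elements in the list
length(x[[1]])  # number of strings in the vector it holds

# Comparing the lists compares their single elements as wholes,
# and the two vectors are not equal, so nothing is shared
whole <- intersect(x, y)

# Extracting the vectors first compares the individual strings
common <- intersect(x[[1]], y[[1]])
print(common)
```

If this assumption holds, the `as.list()` and `as.character()` wrappers in the line above matter only when a list result is wanted; `intersect(x[[1]], y[[1]])` returns a plain character vector.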