Tibble: operation on list columns


I have the following tibble:

temp <- structure(list(x = list(1:10, 1:10), y = list(c(3L, 9L, 10L, 
8L, 1L), c(1L, 3L, 5L, 2L, 4L))), .Names = c("x", "y"), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -2L))


> temp
# A tibble: 2 x 2
           x         y
      <list>    <list>
1 <int [10]> <int [5]>
2 <int [10]> <int [5]>

I would like to create a new list column z that holds, for each row, the set difference of the vectors in x and y (that is, setdiff(x, y) applied row by row), so that temp$z gives:

> temp$z
[[1]]
[1] 2 4 5 6 7

[[2]]
[1]  6  7  8  9 10

and temp would then print as:

> temp
# A tibble: 2 x 3
           x         y         z
      <list>    <list>    <list>
1 <int [10]> <int [5]> <int [5]>
2 <int [10]> <int [5]> <int [5]>

PS: A dplyr solution would be great! :-)

There are 3 answers

JasonWang (best answer)

You can use Map inside mutate:

library(dplyr)

temp %>% mutate(z = Map(setdiff, x, y))
# # A tibble: 2 x 3
#            x         y         z
#       <list>    <list>    <list>
# 1 <int [10]> <int [5]> <int [5]>
# 2 <int [10]> <int [5]> <int [5]>

temp %>% mutate(z=Map(setdiff, x, y)) %>% pull(z)
# [[1]]
# [1] 2 4 5 6 7
# 
# [[2]]
# [1]  6  7  8  9 10
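
For what it's worth, calling setdiff directly on two lists compares whole list elements rather than working pairwise, which is why Map is needed here. A minimal base R sketch of what happens inside mutate, using the same data as the question (x, y and z are plain lists here, not tibble columns):

```r
# Same data as in the question, held as plain lists
x <- list(1:10, 1:10)
y <- list(c(3L, 9L, 10L, 8L, 1L), c(1L, 3L, 5L, 2L, 4L))

# Map pairs up x[[i]] with y[[i]] and calls setdiff on each pair;
# it always returns a list, which is what a list column needs
z <- Map(setdiff, x, y)

# Map(f, ...) is a thin wrapper around mapply(f, ..., SIMPLIFY = FALSE)
z_alt <- mapply(setdiff, x, y, SIMPLIFY = FALSE)

print(z)
```

Because the result is a list with one element per row, mutate can store it directly as the new column.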

austensen

You can use purrr::map2 within mutate.


library(dplyr)
library(purrr)

temp %>% mutate(z = map2(x, y, setdiff))

#> # A tibble: 2 x 3
#>            x         y         z
#>       <list>    <list>    <list>
#> 1 <int [10]> <int [5]> <int [5]>
#> 2 <int [10]> <int [5]> <int [5]>
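
The same call can also be written with purrr's formula shorthand, which helps if the arguments need reordering or extra arguments must be passed. A small sketch, assuming dplyr and purrr are installed; the n_z column and the name res are only added for illustration and are not part of the question:

```r
library(dplyr)
library(purrr)

# Rebuild the question's data (tibble() is re-exported by dplyr)
temp <- tibble(
  x = list(1:10, 1:10),
  y = list(c(3L, 9L, 10L, 8L, 1L), c(1L, 3L, 5L, 2L, 4L))
)

res <- temp %>%
  mutate(
    z   = map2(x, y, ~ setdiff(.x, .y)),  # .x and .y are the paired elements
    n_z = lengths(z)                      # illustrative: size of each result
  )

# Inspect the contents of the new list column
res %>% pull(z)
```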

moodymudskipper

Or just use base R, while we're at it :)

within(temp, z <- Map(setdiff, x, y))
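
Plain assignment should work in base R too, since a list with one element per row can be stored as a column. A sketch on an ordinary data.frame (the id column and the name df are made up for this example):

```r
# Build a data.frame with two list columns; the id column is only a
# placeholder so the frame has two rows before the lists are added
df <- data.frame(id = 1:2)
df$x <- list(1:10, 1:10)
df$y <- list(c(3L, 9L, 10L, 8L, 1L), c(1L, 3L, 5L, 2L, 4L))

# Assign the pairwise set differences as a new list column
df$z <- Map(setdiff, df$x, df$y)

print(df$z)
```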