R: somewhat atypical full_join() with missings in key variable

45 views Asked by At

I have two tables df1 and df2 I want to full_join() using the "id" variable as a key. Cases without id (=NA) should just be added as a row.

Does someone know, how to achieve this?

library(dplyr)

df1 <- data.frame(id = c(1, 2, 3, NA, NA, NA),
                  t1 =  c(1, 2, 3, 4,  5,  6))

df1

df2 <- data.frame(id = c(1, 2, 3, NA, NA, 4),
                  t2 =  c(1, 2, 3, 4,  NA, 5)) 

df2

df1 %>% full_join(df2, by = "id")

Expected output:

id t1 t2
1  1  1
2  2  2
3  3  3
4  NA 5
NA 4  NA   # from df1
NA 5  NA   # from df1
NA 6  NA   # from df1
NA NA  4   # from df2
NA NA NA   # from df2


1

There are 1 answers

0
Mark On

The dplyr way is to use na_matches = "never", as in Seth's comment. The equivalent way of doing this with base R is to use the incomparables parameter of merge():

merge(df1, df2, by = "id", all = TRUE, incomparables = NA)

Output:

  id t1 t2
1  1  1  1
2  2  2  2
3  3  3  3
4  4 NA  5
5 NA  4 NA
6 NA  5 NA
7 NA  6 NA
8 NA NA  4
9 NA NA NA