R dplyr - combining a list of tibbles with different number of rows into a single tibble with list_cbind()

Question

Asked by GeorgeM on 24 December 2022 at 17:07 (383 views)

After using the map function, I ended up with a list of tibbles with different numbers of rows. As suggested in the purrr documentation (https://purrr.tidyverse.org/reference/map_dfr.html?q=map_dfr#null), I used list_cbind() to convert them into a single tibble. However, because the tibbles have different numbers of rows, I get an error message.

A simplified example below:

library(tibble)
library(purrr)

a1 <- tibble(
  name1 = c(1, 2, 3)
)
a2 <- tibble(
  name2 = c(1, 2, 3)
)
a3 <- tibble(
  name3 = c(1, 2)
)
A <- list(a1, a2, a3)

list_cbind(A)

and I get the following error message:

Error in `list_cbind()`:
! Can't recycle `..1` (size 3) to match `..3` (size 2).
Run `rlang::last_error()` to see where the error occurred.

I also tried the size argument, which the documentation describes as "An optional integer size to ensure that every input has the same size (i.e. number of rows)", but the same error still occurs.

list_cbind(list(a1, a2, a3), size = 2)

Any suggestions on how to do this using the tidyverse (or otherwise)?

There are 3 answers

**Erik De Luca** · Answer 1 · 2022-12-24T17:40:01+00:00

First, compute the largest number of rows among the data frames.

Next, pad every data frame that has fewer rows than that maximum with NA rows. Indexing with NA row numbers yields all-NA rows, so this also covers data frames with more than one column.

Finally, bind the padded data frames column-wise. (If they have more than one column, it may be advisable to work on the rows instead and evaluate case by case.)

# Largest number of rows among the tibbles in A
dimMax <- max(sapply(A, nrow))

# Append all-NA rows to each tibble until it has dimMax rows
B <- lapply(A, function(x) rbind(x, x[rep(NA_integer_, dimMax - nrow(x)), ]))

# Bind the padded tibbles column-wise
purrr::list_cbind(B)
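
The padding step described above can also be wrapped in a small helper. The following is only a minimal sketch, assuming tibble and purrr (1.0.0 or later, for list_cbind()) are installed, that column names are unique across the inputs, and that all columns are plain atomic vectors. pad_rows() is a hypothetical name, not part of any package, and the extra column is made up to illustrate the multi-column case.

```r
library(tibble)
library(purrr)

# Hypothetical helper: extend every column of `x` to `n` entries, padding with NA.
# Assumes plain atomic columns (numeric, character, logical).
pad_rows <- function(x, n) {
  as_tibble(lapply(x, `length<-`, n))
}

A <- list(
  tibble(name1 = c(1, 2, 3)),
  tibble(name2 = c(1, 2, 3)),
  tibble(name3 = c(1, 2), extra = c("a", "b"))  # two columns, one row short
)

n_max <- max(map_int(A, nrow))                      # largest number of rows
result <- list_cbind(map(A, pad_rows, n = n_max))   # pad, then bind column-wise
```

Here result should have 3 rows and 4 columns, with NA in the last row of name3 and extra.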

**HoelR** · Answer 2 · 2022-12-24T17:46:19+00:00

A bit long, but it works

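library(dplyr)
library(purrr)
library(tidyr)

# Start from the list A rather than mget(ls(pattern = "a")), which would also
# pick up any other object in the workspace whose name contains "a"
A %>% 
  map_dfr(~ .x %>% mutate(row = row_number())) %>%  # stack the tibbles, tagging each row with its position
  pivot_longer(-row) %>%                            # one line per (row, column) pair
  drop_na() %>%                                     # remove the NAs created by stacking
  pivot_wider(names_from = name, values_from = value)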


# A tibble: 3 × 4
    row name1 name2 name3
  <int> <dbl> <dbl> <dbl>
1     1     1     1     1
2     2     2     2     2
3     3     3     3    NA

**akrun** · Answer 3 · 2022-12-24T20:18:02+00:00

list_cbind() requires all the datasets to have the same number of rows. We may use cbind.na from the qpcR package instead (it is not exported, hence the ::: operator):

do.call(qpcR:::cbind.na, A)
  name1 name2 name3
1     1     1     1
2     2     2     2
3     3     3    NA
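
If depending on an unexported function is a concern, the same kind of result can be obtained in base R alone. This is just a sketch under assumptions: it uses plain data frames instead of tibbles, assumes unique column names and plain atomic columns, and the names n_max, cols and padded are made up for illustration.

```r
# Same data as in the question, as plain data frames (base R only)
A <- list(
  data.frame(name1 = c(1, 2, 3)),
  data.frame(name2 = c(1, 2, 3)),
  data.frame(name3 = c(1, 2))
)

# Largest number of rows among the inputs
n_max <- max(vapply(A, nrow, integer(1)))

# Flatten all columns into one named list, then pad each column with NA
cols <- unlist(lapply(A, as.list), recursive = FALSE)
padded <- as.data.frame(lapply(cols, `length<-`, n_max))
```

The idea is that assigning a larger length to a vector fills the new positions with NA, so every column ends up with n_max entries before the data frame is rebuilt.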

If we want to use list_cbind(), get the maximum number of rows, use it to extend each dataset with NA rows so that all of them are balanced, and then call list_cbind():

library(purrr)
library(dplyr)
mx <- max(map_int(A, nrow))
A %>% 
  map(~ .x[seq_len(mx), ]) %>%
  list_cbind()
# A tibble: 3 × 3
  name1 name2 name3
  <dbl> <dbl> <dbl>
1     1     1     1
2     2     2     2
3     3     3    NA
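
As for the size argument tried in the question: as far as I understand the recycling rules that list_cbind() follows, size only fixes the target number of rows. One-row inputs are recycled to that size, but longer inputs are neither truncated nor padded. A minimal sketch, with tibbles made up purely for illustration:

```r
library(tibble)
library(purrr)

one_row    <- tibble(x = 9)              # size 1: can be recycled
two_rows   <- tibble(y = c(1, 2))        # size 2
three_rows <- tibble(z = c(1, 2, 3))     # size 3

# A one-row tibble is recycled to the requested size
ok <- list_cbind(list(one_row, two_rows), size = 2)

# A three-row tibble cannot be recycled to size 2, so this still errors;
# tryCatch() captures the condition instead of stopping the script
err <- tryCatch(
  list_cbind(list(three_rows, two_rows), size = 2),
  error = function(e) e
)
```

This is why passing size = 2 in the question raises the same kind of error: the inputs have to be padded (or trimmed) before binding.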
