I have a huge list of various tibbles that I outputted from running a for loop. The tibbles in this list are the result of using count()
and group_by_()
, and show the number of people by category a
who do and do not work at various facilities, but each element of the list has a variable number of rows and one column where the name differs.
These are three elements of my (20+ element) list:
library(plyr)
library(tidyverse)
(Hospital1 <- tibble(a = c("In.City", "In.City", "Likely Move", "Likely Move", "Possibly Move", "Unlikely Move", "Unlikely Move", NA),
Something.Hospital1 = c("Hospital1", NA, "Hospital1", NA, NA, "Hospital1", NA, NA),
n= c(sample(1:100, 8, replace = TRUE))))
(Hospital2 <- tibble(a = c("In.City", "In.City", "Likely Move", "Likely Move", "Possibly Move", "Unlikely Move", NA),
Something.Hospital2 = c("Hospital2", NA, "Hospital2", NA, NA, NA, NA),
n= c(sample(1:100, 7, replace = TRUE))))
(Hospital3 <- tibble(a = c("In.City", "Likely Move", "Unlikely Move", "Possibly Move", NA),
Something.Hospital3 = as.character(c(NA, NA, NA, NA, NA)),
n= c(sample(1:100, 5, replace = TRUE))))
#What my data actually looks like
(example.list <- list(Hospital1, Hospital2, Hospital3))
The values of n
in the 3rd column that are not next to the NA
in the second column of each list element matter most. I am so far really struggling to get these into a form that ggplot2 can use them. Because there are about 20 list elements I'm working with, I feel I need to use list structure in the code and be able to iterate or make a function if need be. I've tried to:
1) Use cbind(example.list[[1:3]])
and bind_cols([[1:3]])
, to bind them into one tibble, though both fail because of the varying number of rows (same with single brackets).
2) Graph those elements individually, though ggplot2 fails with lists.
3) Split them into individual dataframes, using list2env(example.list[1:3], envir = .GlobalEnv)
, but that gave me the error
names(x) must be a character vector of the same length as x
4) Doing a for loop left_join()
but to do so, I would need a key, which I could form using something similar to cumsum()
but for character cells. That would have to look like this
a Something.Hospital1 n match.key
<chr> <chr> <int> <dbl>
1 In.City Hospital1 40 1
2 In.City <NA> 25 2
3 Likely Move Hospital1 17 1
4 Likely Move <NA> 56 2
5 Possibly Move Hospital1 59 1
5) I tried using ddply(output[[10]], .(crosstab, n), nrow)
, but V1 isn't helpful for merging like this.
crosstab n V1
1 In.Camden 14 1
2 In.Camden 139 1
3 Likely Move 4 1
4 Likely Move 171 1
In my mind, I would like my data to look like this, with all of the possibilities listed. I could then use tidyr()
to get it into ggplot form
a Something.Hospital1 n1 Something.Hospital2 n2 Something.Hospital3 n3
<chr> <chr> <int> <chr> <int> <chr> <int>
1 In.City Hospital1 39 Hospital2 24 Hospital3 13
2 In.City <NA> 86 <NA> 82 <NA> 85
3 Likely Move Hospital1 77 Hospital2 16 Hospital3 17
4 Likely Move <NA> 4 <NA> 78 <NA> 49
5 Possibly Move Hospital1 100 Hospital2 79 Hospital3 95
6 Possibly Move <NA> 49 <NA> 31 <NA> 62
7 Unlikely Move Hospital1 82 Hospital2 3 Hospital3 96
8 Unlikely Move <NA> 27 <NA> 30 <NA> 39
9 <NA> <NA> 55 <NA> 96 <NA> 47
I don't understand lapply
and related at all, so please be clear about how to use those if you suggest them. I prefer dplyr because that's what I know best.
Thanks so much for your help
You could rename the column in each tibble to the same thing, which would open the door for binding rows and getting the dataset in the long format without needing to reshape later.
You can loop through the list using
lapply
orpurrr::map
, renaming the column in each list element. I'll usemap_dfr
(map_df
in older purrr versions) because it will usebind_rows
to bind the datasets together after looping through the list.map_dfr
has an.id
argument so we can add a column to tell the datasets apart. This may be useful here, but can be skipped.You can change the column name via
rename_at
from dplyr. Because you want to rename a single column, I'll name them all to a new name usingpaste0
.Here's the code to loop through each element of
example.list
and rename all columns that contain the word "Hospital" with "Something.Hospital".