My current data structure is like this.
age_group age_division total_estimate zipcode single_age
<chr> <dbl> <dbl> <dbl> <list>
1 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
2 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
3 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
4 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
5 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
6 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
7 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
8 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
9 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
10 0-4 2 10 51557 list(C(0, 0, 1, 1, 2, 2, 3, 3, 4, 4))
11 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
12 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
13 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
14 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
15 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
16 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
17 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
18 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
19 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
20 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
21 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
22 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
23 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
24 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
25 0-4 3 15 51558 list(C(0, 0 ,0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4))
26 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
27 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
28 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
29 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
30 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
31 85-90 1 8 51559 list(C(0, 1, 2, 3, 4))
For every first unique zipcode and age_group, I want to expand the corresponding first single_age list vertically. For example, if the first unique zipcode is 1, age_group is 0-4 and age_division is 2. Then the list should be in a vertical format where we have 0, 0, 1, 1, 2, 2, 3, 3, 4, 4. This is my code where it is giving me single_age as a list.
# Function to expand age groups
expand_age <- function(age_group, age_division) {
age <- rep(seq(as.numeric(strsplit(age_group, "-")[[1]][1]),
as.numeric(strsplit(age_group, "-")[[1]][2])),
each = age_division)
return(list(age))
}
# Apply the function to each row of the dataframe using mutate
df <- df_1 %>%
group_by(zipcode) %>%
mutate(single_age = list(expand_age(first(age_group), first(age_division))))
This is what the expected output should look like.
age_group age_division total_estimate zipcode single_age
<chr> <dbl> <dbl> <dbl> <dbl>
1 0-4 2 10 51557 0
2 0-4 2 10 51557 0
3 0-4 2 10 51557 1
4 0-4 2 10 51557 1
5 0-4 2 10 51557 2
6 0-4 2 10 51557 2
7 0-4 2 10 51557 3
8 0-4 2 10 51557 3
9 0-4 2 10 51557 4
10 0-4 2 10 51557 4
11 0-4 3 15 51558 0
12 0-4 3 15 51558 0
13 0-4 3 15 51558 0
14 0-4 3 15 51558 1
15 0-4 3 15 51558 1
16 0-4 3 15 51558 1
17 0-4 3 15 51558 2
18 0-4 3 15 51558 2
19 0-4 3 15 51558 2
20 0-4 3 15 51558 3
21 0-4 3 15 51558 3
22 0-4 3 15 51558 3
23 0-4 3 15 51558 4
24 0-4 3 15 51558 4
25 0-4 3 15 51558 4
26 85-90 1 8 51559 85
27 85-90 1 8 51559 86
28 85-90 1 8 51559 87
29 85-90 1 8 51559 88
30 85-90 1 8 51559 89
31 85-90 1 8 51559 90
It's always better to share your data with
dput, as @Onyambu asked.With the given format,
unnest_longer(df_1, single_age)will not work because:single_agehave just one object - a integer vector;C()(with capital "C") is different fromc().You can try this:
Before unnest, simplify it:
The output: