I am trying to calculate the population under 20 by race for each county in MN using the American Community Survey in R. I am aware this can be done with tidycensus using the race-specific B01001 variables (B01001H, B01001B, and so on) for each race and age group. However, I would need to aggregate all the variables for those under 20 within each racial group. According to this handbook chapter (https://www.census.gov/content/dam/Census/library/publications/2018/acs/acs_general_handbook_2018_ch08.pdf), aggregating the estimates is simply a matter of summing the subgroup values, but aggregating the margin of error requires this formula:
MOE = sqrt(moe_1^2 + moe_2^2 + ... + moe_n^2)
for each of the MOEs within a subgroup. So how exactly can I use tidyverse to accurately calculate this aggregated MOE value?
Here is what my code looks like so far:
library(tidycensus)
library(tidyverse)
library(sf)
## age by race (male age groups from each race-specific table)
age_vars_male = c(w1="B01001H_003", w2="B01001H_004", w3="B01001H_005", w4="B01001H_006",
                  b1="B01001B_003", b2="B01001B_004", b3="B01001B_005", b4="B01001B_006",
                  AN1="B01001C_003", AN2="B01001C_004", AN3="B01001C_005", AN4="B01001C_006",
                  AS1="B01001D_003", AS2="B01001D_004", AS3="B01001D_005", AS4="B01001D_006",
                  H1="B01001I_003", H2="B01001I_004", H3="B01001I_005", H4="B01001I_006")
## obtaining variables listed above for MN counties
pop_un20 <- get_acs(geography = "county",
                    variables = age_vars_male,
                    state = "MN",
                    geometry = TRUE)
## label each variable with its race and square the MOE
pop_un20 = pop_un20 %>%
  mutate(Race = case_when(variable %in% c("w1","w2","w3","w4") ~ "White",
                          variable %in% c("b1","b2","b3","b4") ~ "Black",
                          variable %in% c("AN1","AN2","AN3","AN4") ~ "AI/AN",
                          variable %in% c("AS1","AS2","AS3","AS4") ~ "Asian",
                          variable %in% c("H1","H2","H3","H4") ~ "Hispanic/Latino"),
         moe_sqrd = moe^2) %>%
  select(-variable)
## aggregate the MOE and the estimate per county and race
moe_aggregate = pop_un20 %>%
  group_by(NAME, Race) %>%
  summarise(moe_aggregate = sqrt(sum(moe_sqrd, na.rm = TRUE))) %>%
  st_set_geometry(NULL)
est_aggregate = pop_un20 %>%
  group_by(NAME, Race) %>%
  summarise(estimate_aggregate = sum(estimate, na.rm = TRUE)) %>%
  st_set_geometry(NULL)
## join the aggregates back and drop the per-variable columns
pop_under20 = pop_un20 %>%
  right_join(moe_aggregate, by = c("NAME","Race")) %>%
  right_join(est_aggregate, by = c("NAME","Race")) %>%
  select(-estimate, -moe, -moe_sqrd)
I've calculated what I need by first creating a column for the squared MOE, then taking the square root of the sum for each county and race. However, is there a way to do this in one go?
tidycensus has a function, moe_sum(), that does this for you. Adapting your code:
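A minimal sketch of how that could look, assuming the data already carries the NAME, Race, estimate and moe columns from the question. The helper name aggregate_by_race and the toy data frame are hypothetical, and the numbers in it are made up purely for illustration. Passing estimate to moe_sum() is optional; as far as I know it only changes the result when a group contains more than one zero estimate.

```r
library(dplyr)
library(tidycensus)

# Hypothetical helper: sums the estimates and combines the MOEs
# for each county/race group in a single summarise() call.
aggregate_by_race <- function(df) {
  df %>%
    group_by(NAME, Race) %>%
    summarise(
      estimate_aggregate = sum(estimate, na.rm = TRUE),
      # moe_sum() computes sqrt(moe_1^2 + ... + moe_n^2)
      moe_aggregate = moe_sum(moe, estimate = estimate, na.rm = TRUE),
      .groups = "drop"
    )
}

# Toy stand-in for the get_acs() output (made-up values, not real ACS data)
toy <- data.frame(
  NAME     = rep("County A", 3),
  Race     = rep("White", 3),
  estimate = c(100, 120, 110),
  moe      = c(3, 4, 12)
)

result <- aggregate_by_race(toy)
print(result)  # estimate_aggregate = 330, moe_aggregate = sqrt(9 + 16 + 144) = 13
```

With the real data, the same helper could be applied after dropping the geometry, for example aggregate_by_race(sf::st_drop_geometry(pop_un20)), which avoids the cost of unioning county polygons during summarise(). The squared-MOE column and the two joins are then no longer needed.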