I am trying to calculate the incidence of a disease per year and per age-category. I also want to apply direct standardization. Im using the function ageadjust.direct
(package epitools
).
age_cat persondays_individual contactfirst_cat ESPpop personyears year
1 <40 38624 3 26938 106. 2011
2 >90 367 2 691 1.01 2011
3 41-50 111208 10 14214 305. 2011
4 51-60 222777 29 13534 610. 2011
5 61-70 219567 41 11403 602. 2011
6 71-80 102593 26 77484 281. 2011
7 81-90 20056 10 3945 54.9 2011
8 <40 32673 6 26938 89.5 2012
9 >90 366 0 691 1.00 2012
10 41-50 102182 11 14214 280. 2012
11 51-60 209241 29 13534 573. 2012
12 61-70 224701 33 11403 616. 2012
13 71-80 104898 26 77484 287. 2012
14 81-90 23771 9 3945 65.1 2012
df<-structure(list(age_cat = structure(c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("<40", ">90", "41-50",
"51-60", "61-70", "71-80", "81-90"), class = "factor"), persondays_individual = c(38624,
367, 111208, 222777, 219567, 102593, 20056, 32673, 366, 102182,
209241, 224701, 104898, 23771), contactfirst_cat = c(3, 2, 10,
29, 41, 26, 10, 6, 0, 11, 29, 33, 26, 9), ESPpop = c(26938, 691,
14214, 13534, 11403, 77484, 3945, 26938, 691, 14214, 13534, 11403,
77484, 3945), personyears = c(105.819178082192, 1.00547945205479,
304.679452054795, 610.34794520548, 601.553424657534, 281.076712328767,
54.9479452054794, 89.5150684931507, 1.0027397260274, 279.950684931507,
573.26301369863, 615.619178082192, 287.391780821918, 65.1260273972603
), year = c("2011", "2011", "2011", "2011", "2011", "2011", "2011",
"2012", "2012", "2012", "2012", "2012", "2012", "2012")), row.names = c(NA,
14L), class = "data.frame")
I would like to calculate the crude incidence ($contactfirst_cat / personyears), the adjusted incidence and the 95% CI per year and age category.
Desired output
age_cat persondays_individual contactfirst_cat ESPpop personyears year crude.rate adj. rate lci uci
1 <40 38624 3 26938 106. 2011
2 >90 367 2 691 1.01 2011
3 41-50 111208 10 14214 305. 2011
4 51-60 222777 29 13534 610. 2011
5 61-70 219567 41 11403 602. 2011
6 71-80 102593 26 77484 281. 2011
7 81-90 20056 10 3945 54.9 2011
8 <40 32673 6 26938 89.5 2012
9 >90 366 0 691 1.00 2012
10 41-50 102182 11 14214 280. 2012
I've tried the following code (which i got from Calculating crude and age-standardized rates, rate differences, and rate ratios with CIs by subgroups)
df%>%
group_by(year, age_cat) %>%
summarise(age_adjust = list(ageadjust.direct(count = contactfirst_cat, #count of events
pop = personyears, #person years of DFpop
rate = NULL,
stdpop = ESPpop, #standard population (European standard population per age_cat)
conf.level = 0.95))) %>%
mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
ungroup()
However, this code doesnt give me the crude/adj rates and CI's per subgroup (year and age_cat), but i get only 1 observation of these 4 columns.
How can i create new columns for crude.rate, adj.rate, lci and uci PER year and age_cat? Any help would be very appreciated! Many thanks in advance!
EDIT When I run the script in Quinten's answer i get the following result:
crude.rate adj.rate lci uci
1 0.06070314 0.07801597 0.06298263 0.09764104
The output Quinten shows as df_2 is exactly what i want however. Im not sure what im doing wrong.
First you need to
unnest
your code instead ofungroup
like this:Outputd
df_2
:After you can bind your two dataframes to get the desired dataframe you want like this:
Output: