How to make subgroups by prefixes from ICD data?

I have a large ICD-10 data set and would like to group the codes by prefix and get a sum for each group.

For example, I have the codes 'JAL01', 'JAL20' and 'JAL21', and I need the sum over all codes starting with 'JAL'. How do I do that?

Answer by zx8754 (best answer)

Extract the first 3 characters of each code with substr(), then group by that prefix and sum:

# example data
df1 <- data.frame(icd = c("JAL01", "JAL20", "JAL21", "foo11", "foo22"),
                  x = 1:5)

# get 1st 3 letters
df1$grp <- substr(df1$icd, 1, 3)

# get sum per group
aggregate(x ~ grp, df1, sum)
#   grp x
# 1 foo 9
# 2 JAL 6
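
If only one prefix is of interest, or if the prefixes are not always 3 characters long, a variation along these lines might help. This is a minimal sketch in base R (startsWith() needs R 3.3.0 or later); it reuses the example data from the answer, and it assumes that the prefix is everything before the trailing digits of a code, which may not hold for every code in real ICD data.

```r
# example data (same as in the answer above)
df1 <- data.frame(icd = c("JAL01", "JAL20", "JAL21", "foo11", "foo22"),
                  x = 1:5)

# sum for a single known prefix, without building a group column
jal_sum <- sum(df1$x[startsWith(df1$icd, "JAL")])

# assumption: the prefix is whatever precedes the trailing digits,
# so prefixes of different lengths are handled too
df1$grp <- sub("[0-9]+$", "", df1$icd)

# named vector of sums, one element per prefix
grp_sums <- tapply(df1$x, df1$grp, sum)
```

Here jal_sum holds the total for the 'JAL' codes only, and grp_sums can be indexed by prefix, for example grp_sums[["JAL"]].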