Say we have a data frame,
library(tidyverse)
library(rlang)
df <- tibble(id = rep(1:2, 10),
             grade = sample(c("A", "B", "C"), 20, replace = TRUE))
and we would like to get the proportion of each grade within each id,
df %>%
  group_by(id) %>%
  summarise(
    n = n(),
    mu_A = mean(grade == "A"),
    mu_B = mean(grade == "B"),
    mu_C = mean(grade == "C")
  )
I am handling a case where there are many conditions (many grades, in this case) and would like to make my code more robust. How can this be simplified using tidy evaluation in dplyr 1.0?
The idea is to generate all of the summary columns by passing every grade at once, without breaking the dplyr pipe, something like
# how to get the mean of A, B, C all at once?
mu_{grade} := mean(grade == {grade})
I actually found the answer to my own question in a post that I wrote 2 years ago...
I am just going to post the code right below, hoping it helps anybody who comes across the same problem.
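One way this can be done is sketched below. It is a minimal illustration and not necessarily identical to the code from that earlier post. It builds one quoted expression per grade with `rlang::expr()` and splices the named list into `summarise()` with `!!!`. The names `grades`, `mu_exprs` and `res` are made up for this example, and `set.seed()` is only there to make the sample reproducible.

```r
library(dplyr)
library(rlang)

set.seed(1)  # only so the random grades are reproducible
df <- tibble(id = rep(1:2, 10),
             grade = sample(c("A", "B", "C"), 20, replace = TRUE))

# All conditions we want a column for (hypothetical helper vector)
grades <- c("A", "B", "C")

# One quoted expression per grade: mean(grade == "A"), mean(grade == "B"), ...
# !!g inlines the current grade string into the expression
mu_exprs <- lapply(grades, function(g) expr(mean(grade == !!g)))

# The list names become the output column names: mu_A, mu_B, mu_C
names(mu_exprs) <- paste0("mu_", grades)

# !!! splices the named expressions in as separate summarise() arguments
res <- df %>%
  group_by(id) %>%
  summarise(n = n(), !!!mu_exprs)

res
```

To cover every grade that appears in the data instead of a hard-coded vector, `grades` could be set to `sort(unique(df$grade))` before building the expressions.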