I have a data frame like this:
| Factory | Bread |
|---|---|
| A | a |
| A | a |
| B | c |
| B | b |
| B | d |
| C | a |
| D | e |

I want to find the name of the factory with the largest number of breads.
I wrote two pieces of code and got different answers.
1.
df %>%
  group_by(Factory, Bread) %>%
  summarise(n = n()) %>%
  arrange(desc(n))
2.
df %>%
  group_by(Factory) %>%
  mutate(number = length(unique(Bread))) %>%
  arrange(desc(number))
May I ask which one is the right code and why?
Thank you!!!!
We could use `n_distinct` from the `dplyr` package.
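The answer's original code did not survive, so what follows is only a minimal sketch of how `n_distinct` might be applied here, not necessarily what the answerer wrote. It assumes the goal is the factory with the most distinct breads, and it rebuilds `df` from the table in the question with the column names `Factory` and `Bread`. For context on why the two attempts disagree: the first groups by both columns, so `n()` counts rows per factory/bread pair (A/a appears twice and sorts to the top), while the second counts unique breads per factory but, because it uses `mutate`, keeps every original row instead of one row per factory.

```r
library(dplyr)

# Sample data reconstructed from the table in the question (an assumption:
# the real column names and types may differ)
df <- data.frame(
  Factory = c("A", "A", "B", "B", "B", "C", "D"),
  Bread   = c("a", "a", "c", "b", "d", "a", "e"),
  stringsAsFactors = FALSE
)

# One row per factory with its count of distinct breads,
# then keep the factory (or tied factories) with the highest count
result <- df %>%
  group_by(Factory) %>%
  summarise(number = n_distinct(Bread)) %>%
  filter(number == max(number))

result
# With the table above, this keeps a single row: Factory B, number 3
```

Using `summarise` rather than `mutate` collapses each factory to one row, which makes the top factory easy to read off; `filter(number == max(number))` keeps ties if several factories share the maximum.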