I would like to create a variable called percentile that holds the quartile of each value within its group. I have the following dataset, and I would like to create the last variable, percentile:
  id group value
1  1     1     1
2  2     1     2
3  3     1     3
4  4     1     4
5  5     2    10
6  6     2    20
7  7     2    30
8  8     2    40
The following is the expected outcome:
id group value percentile
 1     1     1          1
 2     1     2          2
 3     1     3          3
 4     1     4          4
 5     2    10          1
 6     2    20          2
 7     2    30          3
 8     2    40          4
So far I have tried the following using the library dplyr:
df <- df %>% group_by(group) %>% within(df, percentile <- as.integer(cut(value, quantile(value, probs=0:4/4),
include.lowest=TRUE)))
But it does not seem to work: it neither produces a variable called percentile nor gives me an error.
Is this what you need?:
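As far as I can tell, the attempt in the question fails silently because within() is a base R function that does not fit into the pipe: the piped data frame becomes its first argument, so `df` is taken as the expression to evaluate and the assignment to percentile is never run. A minimal sketch of one way to do it with mutate() instead, assuming the data frame is called df as in the question (the data below is rebuilt from the question's table):

```r
library(dplyr)

# Example data rebuilt from the table in the question
df <- data.frame(
  id    = 1:8,
  group = rep(1:2, each = 4),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# mutate() evaluates cut() once per group, so the quantile
# breaks are computed from each group's own values
result <- df %>%
  group_by(group) %>%
  mutate(percentile = cut(value,
                          breaks = quantile(value, probs = 0:4 / 4),
                          include.lowest = TRUE)) %>%
  ungroup()

print(result)
```

Here percentile holds the interval each value falls into rather than a number from 1 to 4.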
re: If you want the values 1 to 4, you could:
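A sketch of how that might look, again assuming the data frame from the question is called df: wrapping cut() in as.integer() inside mutate() turns each group's intervals into the codes 1 to 4. The conversion has to happen inside the grouped mutate(), since the interval labels differ between groups. dplyr's ntile() is shown as an alternative; it assigns buckets by rank rather than by quantile breaks, so the two can disagree when there are ties or uneven group sizes.

```r
library(dplyr)

# Example data rebuilt from the table in the question
df <- data.frame(
  id    = 1:8,
  group = rep(1:2, each = 4),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

result <- df %>%
  group_by(group) %>%
  mutate(
    # quantile-based bins, converted to integer codes per group
    percentile = as.integer(cut(value,
                                breaks = quantile(value, probs = 0:4 / 4),
                                include.lowest = TRUE)),
    # rank-based alternative: four buckets of (roughly) equal size
    percentile_ntile = ntile(value, 4)
  ) %>%
  ungroup()

print(result)
```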