Calculate the quartiles of certain values per group


I would like to create a variable called percentile that holds the quartile of each value within its group. I have the following dataset:

  id group value
1  1     1     1
2  2     1     2
3  3     1     3
4  4     1     4
5  5     2    10
6  6     2    20
7  7     2    30
8  8     2    40

The following is the expected outcome, where the last column, percentile, is the one I want to create:

id group value percentile
 1     1     1          1
 2     1     2          2
 3     1     3          3
 4     1     4          4
 5     2    10          1
 6     2    20          2
 7     2    30          3
 8     2    40          4

So far I have tried the following using the library dplyr:

df <- df  %>% group_by(group) %>% within(df, percentile <- as.integer(cut(value, quantile(value, probs=0:4/4), 
                                                              include.lowest=TRUE)))

But it does not seem to work. It does not create a variable called percentile, nor does it give me an error.


There is 1 answer

erasmortg (best answer)

Is this what you need?

# ecdf(x)(x) gives each value's cumulative proportion within its group
df$percentile = ave(df$value, df$group, FUN=function(x) ecdf(x)(x))
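To see what this line produces, here is a minimal sketch that rebuilds the sample data from the question (the `data.frame()` construction is my own reconstruction of the printed table) and applies the same `ave()` call. With four distinct values per group, the result should be the proportions 0.25, 0.5, 0.75 and 1 rather than the integers 1 to 4.

```r
# Rebuild the example data shown in the question (assumed construction)
df <- data.frame(
  id    = 1:8,
  group = rep(1:2, each = 4),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# ave() applies FUN to the values of each group and returns a vector
# in the original row order; ecdf(x) builds the empirical CDF of x,
# and calling it on x gives each value's cumulative proportion
df$percentile <- ave(df$value, df$group, FUN = function(x) ecdf(x)(x))

print(df)
```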

Edit: if you want the values 1 to 4 instead, you could:

df$percentile = factor(df$percentile)
levels(df$percentile) <- 1:4
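Relabelling the factor levels works here because the sample data yields exactly four distinct proportions. As a possible alternative, the sketch below reuses the `cut()`/`quantile()` idea from the question but applies it per group with `ave()`, so the result is numeric from the start. The helper name `quartile_of` is hypothetical, and the data frame is again my reconstruction of the example. Note that `cut()` will stop with an error if a group's quantile breaks are not unique (for example with many tied values), so treat this as a starting point rather than a general solution.

```r
# Rebuild the example data shown in the question (assumed construction)
df <- data.frame(
  id    = 1:8,
  group = rep(1:2, each = 4),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# Hypothetical helper: bin a vector into its own quartiles (1 to 4)
quartile_of <- function(x) {
  breaks <- quantile(x, probs = 0:4 / 4)   # 0%, 25%, 50%, 75%, 100%
  # include.lowest = TRUE keeps the group minimum in the first bin
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
}

# Apply the helper within each group, keeping the original row order
df$percentile <- ave(df$value, df$group, FUN = quartile_of)

print(df)
```

With dplyr loaded, the same helper should also work as `df %>% group_by(group) %>% mutate(percentile = quartile_of(value))`, since `mutate()` evaluates the expression separately for each group.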