This task seems straightforward, but after reading several answers on Stack Overflow I still can't get the right result, so I need help. I've been studying this post: How to rank within groups in R?
I have data from experiments in which multiple variables are collected, and I need to rank the performance of some products for each experimental condition. Here is a sample of the expected output for the ranking column:
          id           customer     location        fluid water temperature speed time product response ranking
1  103365333 Acme International   Newtown US  light fluid     5         105     8    2   AK125    25.94       1
2  103365333 Acme International   Newtown US  light fluid     5         105     8    2   AK560    25.19       2
3  103365333 Acme International   Newtown US  light fluid     5         105     8    2   PR600    24.56       3
4  103365333 Acme International   Newtown US  light fluid     5         105     8    2   PR300    23.69       4
5  103365333 Acme International   Newtown US  light fluid     5         105     8    2   XY500    23.63       5
6  103365333 Acme International   Newtown US  light fluid     5         105     8    2  XYZ123    22.75       6
7  103365333 Acme International   Newtown US  light fluid     5         105     8    2  ABC567    21.50       7
8  103365333 Acme International   Newtown US  light fluid     5         105     8    2  Z12345    21.50       8
9  103365333 Acme International   Newtown US  light fluid     5         105     8    2  W21450    21.00       9
10 103365333 Acme International   Newtown US  light fluid     5         105     8    2  W21010    20.54      10
11 103365333 Acme International   Newtown US  heavy fluid     5         105     8    2  W20001    19.06      11
12 103365333 Acme International   Newtown US  heavy fluid     5         105     8    2  W22025    15.88      12
13 155259007  New Great Company Ghosttown CA residue good    10         105     8    2   AK125    13.52       1
14 155259007  New Great Company Ghosttown CA residue good    10         120     4    2   AK560     8.75       1
15 155259007  New Great Company Ghosttown CA residue good    10         120     4    2   PR600     6.00       2
16 155259007  New Great Company Ghosttown CA residue good    10         120     4    2   PR300     1.50       3
17 155259007  New Great Company Ghosttown CA residue good    10         120     4    2   XY500     1.50       4
18 155259007  New Great Company Ghosttown CA residue good     5         105     8    2  XYZ123    14.25       1
19 155259007  New Great Company Ghosttown CA residue good     5         105     8    2  ABC567    13.25       2
20 155259007  New Great Company Ghosttown CA residue good     5         105     8    2  Z12345    12.88       3
My goal is to rank the product by the response in decreasing order within each experimental condition, as shown in the expected output. Not every product is used in every experiment, which makes it tricky.
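To make the target concrete, here is a minimal sketch on made-up data of the per-group ranking I am after. The data frame toy, its condition column, and all of its values are hypothetical stand-ins, not my real data. It assumes dplyr is installed, and it calls each verb with an explicit dplyr:: prefix so the result does not depend on which other packages happen to be attached. It uses row_number() so that tied responses get consecutive ranks, as in the expected output above. It is only an illustration of the intended behavior, not a diagnosis of my pipeline.

```r
library(dplyr)  # attached here only to provide the %>% pipe

# Hypothetical toy data: one grouping column standing in for the
# full set of experimental-condition columns.
toy <- data.frame(
  condition = c("A", "A", "A", "B", "B"),
  product   = c("P1", "P2", "P3", "P1", "P3"),
  response  = c(21.5, 25.9, 21.5, 8.75, 14.25),
  stringsAsFactors = FALSE
)

# Rank products within each condition, highest response first.
# row_number() breaks ties by row order, so tied responses
# receive consecutive ranks rather than a shared rank.
ranked <- toy %>%
  dplyr::group_by(condition) %>%
  dplyr::mutate(ranking = dplyr::row_number(dplyr::desc(response))) %>%
  dplyr::ungroup() %>%
  dplyr::arrange(condition, ranking)

print(ranked)
```

On this toy input the ranking restarts at 1 for each condition, which is the behavior I want on the real data.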
Here is the "standard" pipeline I am trying:
df %>%
  arrange(id, customer, location, fluid, water, temperature, speed, time, -response) %>%
  group_by(id, customer, location, fluid, water, temperature, speed, time) %>%
  mutate(ranking = dense_rank(response))
But all I get is the overall ranking, not a ranking per group. Do you see anything wrong with my code, or is there some limit on the number of variables that can be used in group_by? I've also tried the other ranking functions (which are all based on rank, though). Thanks.