Lookup inside mutate pipe

77 views Asked by At

I have a tibble (counts_table) that has the amount of times each item is repeated in the dataset:

item count
A 6
B 3
C 3
D 3

In my working dataset, I have the also the variable item, along with many others. I want to change one particular (amount) and I wanted it scaled down, divided over however many times the item has been repeated.

I thought of using mutate() within a pipe, but I get stuck when trying to define some sort of lookup function. In pseudocode:

  • Look up the item by its name in the counts_table tibble, and find the associated count (say n)

  • Then mutate the amount to amount / n

I'm really struggling to define the lookup within the pipe. I have tried with:

df %>%
  mutate(amount = amount / counts_table[counts_table$item == item, "count"]) 

But I get the following error:

✖ Logical subscript `counts_table$item == item` must be size 1 or 287, not 3768. 
ℹ longer object length is not a multiple of shorter object length 

When the second bit is run on its own:

counts_table[counts_table$item == "string", "count"]

It does give me the value from the counts_table tibble (say 6), so why does it not apply the 6 to the mutate function in the pipe?

1

There are 1 answers

1
PGSA On

Absent a reproducible example, I think this is worth a shot:

library(dplyr)
df |> left_join(counts_table |> select(item, count),
                by = c("item" = "item")) |>
  mutate(amount = amount/count) |>
  select(-count)

testing:

df <- data.frame(item = LETTERS[1:4],
                 amount = c(600,300,300,900))

counts_table <- data.frame(item = LETTERS[1:4],
                           count = c(6,3,3,3))
# gives
  item amount
1    A    100
2    B    100
3    C    100
4    D    300

Edit: this assumes all possble (items) appear in the counts_table - otherwise you will get NAs.