I'm working with a large dataset and doing some calculations with the aggregate() function.
This time I need to group by two different columns, and my calculation needs a user-defined function that also uses two columns of the data.frame. That's where I'm stuck.
Here's an example data set:
dat <- data.frame(Kat = c("a","b","c","a","c","b","a","c"),
Sex = c("M","F","F","F","M","M","F","M"),
Val1 = c(1,2,3,4,5,6,7,8)*10,
Val2 = c(2,6,3,3,1,4,7,4))
> dat
Kat Sex Val1 Val2
a M 10 2
b F 20 6
c F 30 3
a F 40 3
c M 50 1
b M 60 4
a F 70 7
c M 80 4
Example of the user-defined function:
sum(Val1 * Val2) # but grouped by Kat and Sex
I tried this:
aggregate((dat$Val1),
by = list(dat$Kat, dat$Sex),
function(x, y = dat$Val2){sum(x*y)})
Output:
Group.1 Group.2 x
a F 1710
b F 600
c F 900
a M 300
b M 1800
c M 2010
But my expected output would be:
Group.1 Group.2 x
a F 610
b F 120
c F 90
a M 20
b M 240
c M 370
Is there any way to do this with aggregate()?
As @jogo suggested:
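The suggested code is not preserved here. As a minimal sketch of one way to get the expected result with aggregate() (an assumption, not necessarily what was originally suggested), compute the row-wise product first and then sum it per group. The names `dat2`, `x` and `res` are illustrative only.

```r
# Example data from the question
dat <- data.frame(Kat  = c("a","b","c","a","c","b","a","c"),
                  Sex  = c("M","F","F","F","M","M","F","M"),
                  Val1 = c(1,2,3,4,5,6,7,8)*10,
                  Val2 = c(2,6,3,3,1,4,7,4))

# Multiply row by row first, so each product stays tied to its own row
# (dat2 and x are illustrative names, not from the original post)
dat2 <- transform(dat, x = Val1 * Val2)

# Then sum the products within each Kat/Sex combination
res <- aggregate(x ~ Kat + Sex, data = dat2, FUN = sum)
res
```

This should reproduce the expected values from the question. The original attempt goes wrong because the default `y = dat$Val2` is always the full column, so each group's `x` is recycled against all eight values of Val2 instead of being paired with its own rows.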
Or in a tidyverse style:
Or with data.table:
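The code for these two variants is not preserved here either. The following is a rough sketch of what they might look like, assuming the dplyr (1.0.0 or later, for the `.groups` argument) and data.table packages are installed. The result names `res_dplyr` and `res_dt` and the column name `x` are illustrative.

```r
# Example data from the question
dat <- data.frame(Kat  = c("a","b","c","a","c","b","a","c"),
                  Sex  = c("M","F","F","F","M","M","F","M"),
                  Val1 = c(1,2,3,4,5,6,7,8)*10,
                  Val2 = c(2,6,3,3,1,4,7,4))

# tidyverse style: group by both columns, then sum the row-wise products
library(dplyr)
res_dplyr <- dat %>%
  group_by(Kat, Sex) %>%
  summarise(x = sum(Val1 * Val2), .groups = "drop")
res_dplyr

# data.table style: j computes the sum, by = lists the grouping columns
library(data.table)
res_dt <- as.data.table(dat)[, .(x = sum(Val1 * Val2)), by = .(Kat, Sex)]
res_dt
```

Both compute the same per-group sums as the expected output, although the row order can differ from what aggregate() returns.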