Suppose I have a nice little data frame:
df <- data.frame(x=seq(1,5),y=seq(5,1),z=c(1,2,3,2,1),a=c(1,1,1,2,2))
df
## x y z a
## 1 1 5 1 1
## 2 2 4 2 1
## 3 3 3 3 1
## 4 4 2 2 2
## 5 5 1 1 2
and I want to aggregate a part of it:
aggregate(cbind(x,z)~a,FUN=sum,data=df)
## a x z
## 1 1 6 6
## 2 2 9 3
How do I go about making it programmatic? I want to pass:
- The list of variables to be aggregated, cbind(x,z)
- The grouping variable, a (I will be using it in several other parts of the program, so passing the whole thing, cbind(x,z)~a, is not helpful)
- The environment within which the things are happening
My starting point is
blah <- function(varlist,groupvar,df) {
  # I kinda like to see what I am doing here
  cat(paste0(deparse(substitute(varlist)),"~",deparse(substitute(groupvar))),"\n")
  cat(is.data.frame(df),"\n")
  cat(dim(df),"\n")
  # but I really need to aggregate this
  return( aggregate(eval(deparse(substitute(varlist))~deparse(substitute(groupvar)),df),
                    FUN=sum,data=df) )
}
and it works halfway:
blah(cbind(x,z),a,df)
## [1] "cbind(x, z)~a"
## TRUE
## 5 4
## Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
So I am kind of able to build the character representation of the formula that I need, but putting it into aggregate() fails.
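As far as I can tell, the error arises because the two sides of the formula inside eval() end up as character strings (the deparsed text) rather than as columns of df, so sum() is handed character data. One possible way around it, sketched here under assumptions rather than offered as the definitive fix, is to paste the deparsed pieces into a single string and convert that string with as.formula() before calling aggregate(). The name blah2 and the local variables ftext and f are hypothetical, introduced only for this illustration.

```r
# Same toy data as in the question.
df <- data.frame(x = seq(1, 5), y = seq(5, 1), z = c(1, 2, 3, 2, 1), a = c(1, 1, 1, 2, 2))

# blah2 is a hypothetical variant of blah(), not a tested drop-in replacement.
blah2 <- function(varlist, groupvar, df) {
  # Capture the unevaluated arguments and glue them into text such as "cbind(x, z) ~ a".
  # Assumption: each expression is short enough that deparse() returns a single string.
  ftext <- paste(deparse(substitute(varlist)), "~", deparse(substitute(groupvar)))
  # Convert the text into a real formula object.
  f <- as.formula(ftext)
  # aggregate() now receives a formula, so the variables are looked up in df.
  aggregate(f, data = df, FUN = sum)
}

res <- blah2(cbind(x, z), a, df)
res
```

With the data frame above, this should reproduce the table from the direct aggregate(cbind(x,z)~a,FUN=sum,data=df) call. If the pieces are easier to pass around as strings than as bare expressions, reformulate() from the stats package may be another option worth considering.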