Interpret formulae/operators as functions

67 views Asked by At

Is it possible in R to assign custom functions to mathematical operators (eg. *, +) or interpret the formulae supplied with as.formula() as a directive to evaluate?

Specifically, I would like * to be interpretted as intersect(), and + as c(), so R would evaluate the expression

(a * (b + c)) * d) OR myfun(as.formula('~(a * (b + c)) * d)'), list(a, b, c, d))

AS

intersect(intersect(a, c(b, c)), d)

I'm able to produce the same outcome with gsub()ing an expression supplied as string in a while() loop, but I guess it's far from perfection.

Edit: I've mistakenly posted sum() instead of c(), so some answers may refer to the unedited version of the question.

Example:

############################
## Define functions

var <- '[a-z\\\\{\\},]+'
varM <- paste0('(', var, ')')
varPM <- paste0('\\(', varM, '\\)')

## Strip parentheses
gsubP <- function(x) gsub(varPM, '\\1', x)

## * -> intersect{}
gsubI <- function(x) {
    x <- gsubP(x)
    x <- gsub(paste0(varM, '\\*', varM), 'intersect\\{\\1,\\2\\}', x)
    return(x)
}

## + -> c{}
gsubC <- function(x) {
    x <- gsubP(x)
    x <- gsub(paste0(varM, '\\+', varM), 'c\\{\\1,\\2\\}', x)
    return(x)
}

############################
## Set variables and formula
a <- 1:10
b <- 5:15
c <- seq(1, 20, 2)
d <- 1:5

string <- '(a * (b + c)) * d'


############################
## Substitute formula

string <- gsub(' ', '', string)

while (!identical(gsubI(string), string) || !identical(gsubC(string), string)) {
    while (!identical(gsubI(string), string)) {
        string <- gsubI(string)
    }
    string <- gsubC(string)
}

string <- gsub('{', '(', string, fixed=TRUE)
string <- gsub('}', ')', string, fixed=TRUE)


## SHAME! SHAME! SHAME! ding-ding
eval(parse(text=string))
2

There are 2 answers

1
Nick Kennedy On BEST ANSWER

You can do this:

 `*` <- intersect
 `+` <- c

Be aware that if you do that in the global environment (not a function) it will probably make the rest of your script fail unless you intend for * and + to always do sum and intercept. Other options would be to use S3 methods and classes to restrict that usage.

* and + have special meaning within formulae, so I don't think you can override that. But you can use a formula as a way of passing an unevaluated expression as per @MrFlick's answer.

1
MrFlick On

A formula is really just a way to hold an unevaluated expression. You can create an environment where those functions are re-defined and then evaluate that expression in that environment. Here's a function that will do much of that for you. First, your sample input

a <- 1:10
b <- 5:15
c <- seq(1, 20, 2)
d <- 1:5

Now the function

myfun <- function(x, env=parent.frame()) {
    #check the formula
    stopifnot("formula" %in% class(x), length(x)==2)

    #redefine functions
    funcs <- list2env(list(
        `+`=base::c, 
        `*`=base::intersect
    ), parent=env)
    eval(x[[2]], funcs)
}

and we would call it with

myfun( ~(a * (b + c)) * d )
# [1] 1 3 5

Here we grab the variable values from the current enviroment, If you wanted to, we could also pass those as parameters

myfun <- function(x, ..., .dots=list()) {
    #check the formula
    stopifnot("formula" %in% class(x), length(x)==2)

    #check variables
    dotraw <- sapply(substitute(...()), deparse)
    dots <- list(...)
    if(length(dots) && is.null(names(dots))) names(dots)<-dotraw
    dots <- c(dots,.dots)
    stopifnot(all(names(dots)!=""))

    #redefine functions
    funcs <- list2env(list(
        `+`=base::c, 
        `*`=base::intersect
    ), parent=parent.frame())
    eval(x[[2]], dots, funcs)
}

Then you could do

myfun( ~(a * (b + c)) * d , a, b, c, d)
myfun( ~(a * (b + c)) * d , a=b, b=a, c=d, d=c)
myfun( ~(a * (b + c)) * d , .dots=list(a=a, b=b, c=c, d=d))
myfun( ~(a * (b + c)) * d , .dots=mget(c("a","b","c","d")))