Apply list of functions to list of values


In reference to this question, I was trying to figure out the simplest way to apply a list of functions to a list of values. Basically, a nested lapply. For example, here we apply sd and mean to each column of the built-in dataset trees:

funs <- list(sd=sd, mean=mean)
sapply(funs, function(x) sapply(trees, x))

to get:

              sd     mean
Girth   3.138139 13.24839
Height  6.371813 76.00000
Volume 16.437846 30.17097

But I was hoping to avoid the inner function and have something like:

sapply(funs, sapply, X=trees)

which doesn't work because X matches the first sapply instead of the second. We can do it with functional::Curry:

sapply(funs, functional::Curry(sapply, X=trees))

but I was hoping maybe there was a clever way to do this with positional and name matching that I'm missing.
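
To make the matching problem concrete, here is a minimal base-R sketch. The helper `partial` is hypothetical, written only for this illustration (it is not part of base R or of the functional package); it simply pre-fills some arguments, which is roughly the job Curry does above.

```r
funs <- list(sd = sd, mean = mean)

# X = trees is matched by name to the OUTER sapply, so `funs` is left to fill
# FUN and the call fails.
res <- tryCatch(sapply(funs, sapply, X = trees), error = function(e) e)
inherits(res, "error")

# Hypothetical helper: fix some arguments now, supply the rest later.
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

# Each inner call becomes sapply(X = trees, <function>).
by_helper <- sapply(funs, partial(sapply, X = trees))
by_lambda <- sapply(funs, function(f) sapply(trees, f))
identical(by_helper, by_lambda)
```

The helper does the same disambiguation as the anonymous function; it just hides it behind a reusable name.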

There are 4 answers

Julien Navarre (best answer)

Since mapply uses the ellipsis (...) to pass vectors (atomic vectors or lists) rather than a named argument (X) as in sapply, lapply, etc., you don't need to name the parameter X = trees if you use mapply instead of sapply:

funs <- list(sd = sd, mean = mean)

x <- sapply(funs, function(x) sapply(trees, x))

y <- sapply(funs, mapply, trees)

> y
              sd     mean
Girth   3.138139 13.24839
Height  6.371813 76.00000
Volume 16.437846 30.17097
> identical(x, y)
[1] TRUE

You were one letter away from what you were looking for! :)

Note that I used a list for funs because I can't create a data frame of functions; trying to do so gave me an error.

> R.version.string
[1] "R version 3.1.3 (2015-03-09)"
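
As a sketch of why this works: sapply calls FUN(X[[i]], ...), so each inner call is effectively mapply(<function>, trees). The function lands in mapply's first parameter (FUN) and trees falls into `...`, with no name to collide. The variable names below are only for illustration.

```r
funs <- list(sd = sd, mean = mean)

# Write out the two inner calls by hand and bind them as columns.
by_hand <- cbind(sd = mapply(sd, trees), mean = mapply(mean, trees))

# The one-liner from the answer.
by_sapply <- sapply(funs, mapply, trees)

all.equal(by_hand, by_sapply)
```

Note that mapply vectorizes over everything passed through `...`, so this shortcut fits best when there is a single data argument.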

MrFlick

You're basically going to need an anonymous function of some sort, because there is no other way to distinguish named parameters between the two different sapply calls. You've already shown an explicit anonymous function and the Curry method. You could also use magrittr:

 library(magrittr)
 sapply(funs, . %>%  sapply(trees, .))
 # or .. funs %>% sapply(. %>%  sapply(trees, .))

but the point is you need something there to do the splitting. The "problem" is that sapply dispatches to lapply, which is an internal function that seems determined to place the changing values at the beginning of the function call. You need something to reorder parameters, and because both calls have identical sets of parameter names, it's not possible to tease them apart without a helper function to take care of the disambiguation.

The mapply function does allow you to pass a list to MoreArgs, which gives a way around the named parameter conflict. MoreArgs is intended to separate the items you vectorize over from those that stay fixed. Thus you can do:

mapply(sapply, funs, MoreArgs=list(X=trees))
#               sd     mean
# Girth   3.138139 13.24839
# Height  6.371813 76.00000
# Volume 16.437846 30.17097
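
A small sketch of what that call expands to, using only base R (variable names are illustrative): mapply iterates over funs and appends MoreArgs to every call, so each call is sapply(<function>, X = trees), where X is taken by name and the function fills FUN.

```r
funs <- list(sd = sd, mean = mean)

# One of the calls mapply makes, written out by hand.
one_call <- sapply(sd, X = trees)
all.equal(one_call, sapply(trees, sd))

# Map() is mapply() with SIMPLIFY = FALSE, so the same idea returns a list
# with one element per function instead of a matrix.
as_list <- Map(sapply, funs, MoreArgs = list(X = trees))
names(as_list)
```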

Rentrop

Another approach using purrr would be:

require(purrr)

funs <- list(sd=sd, mean=mean)
trees %>% map_df(~invoke_map(funs, ,.), .id="id")

Important: Note the empty second argument of invoke_map to match by position. See ?purrr::invoke_map examples.

which gives you:

Source: local data frame [3 x 3]

      id        sd     mean
   <chr>     <dbl>    <dbl>
1  Girth  3.138139 13.24839
2 Height  6.371813 76.00000
3 Volume 16.437846 30.17097

Instead of rownames this approach gives you a column id containing the original columns.
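
For comparison, a similar shape (an id column rather than row names) can be sketched in base R by wrapping the matrix result in a data frame. This is an illustrative alternative, not part of the purrr approach above.

```r
funs <- list(sd = sd, mean = mean)

# Matrix with one row per column of trees and one column per function.
m <- sapply(funs, function(f) sapply(trees, f))

# Move the row names into an explicit id column and reset the row names.
res <- data.frame(id = rownames(m), m, row.names = NULL)
names(res)
```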

egnha

Though not as edifying nor as elegant as the solution presented by @Floo0, here is yet another take using tidyr and dplyr:

library(dplyr)
library(tidyr)

fns <- funs(sd = sd, mean = mean)
trees %>% 
    gather(property, value, everything()) %>% 
    group_by(property) %>% 
    summarise_all(fns)

#   A tibble: 3 x 3
#   property        sd     mean
#      <chr>     <dbl>    <dbl>
# 1    Girth  3.138139 13.24839
# 2   Height  6.371813 76.00000
# 3   Volume 16.437846 30.17097

This sequence of operations does a decent job of signaling intent, at the cost of extra verbosity.