Which is the more efficient way to accumulate a vector?

157 views Asked by At

Which of the following is more efficient in terms of time to completion?

one.way = function(n) { if (n) c(n, one.way(n-1)) }
two.way = function(n) { if (n) c(two.way(n-1), n) }

The emphasis here is on which order to combine a singleton vector with another vector whose length is likely greater than one.

How would your answer change if you need to include the time to reverse the vector after constructing it with one of the above functions? It may or may not be known at the time of combination in which order the combined objects are going to be needed.

1

There are 1 answers

1
CPak On

To answer your question, they're about equal

Use microbenchmark to test the performance of two fast functions

one.way = function(n) { c(n, 1) }
two.way = function(n) { c(1, n) }

n <- 2:100

library(microbenchmark)
microbenchmark(one.way(n), two.way(n), times=1e5)

#       expr   min    lq     mean median    uq      max neval
# one.way(n) 1.399 1.866 2.009775  1.867 1.867 2817.733 1e+05
# two.way(n) 1.399 1.866 2.086267  1.867 1.867 3394.808 1e+05

Notice the lower-quartile, median, and upper-quartile are all the same. The means are slightly different but it's not significant. See below

myfun <- function(n) {
              temp <- microbenchmark(one.way(n), two.way(n), times=1e5)
              df <- data.frame(one.way = mean(temp$time[temp$expr=="one.way(n)"]),
                           two.way = mean(temp$time[temp$expr=="two.way(n)"]))
              return(df)
     }

library(purrr)
test <- map_df(1:100, ~myfun(n))

t.test(test$one.way, test$two.way)

# data:  test$one.way and test$two.way
# t = -1.6506, df = 182.72, p-value = 0.1005
# alternative hypothesis: true difference in means is not equal to 0