Running time of the foreach package


I have a problem using the foreach package in R. When I run this code:

tmp=proc.time()
x<-for(i in 1:1000){sqrt(i)} 
x
proc.time()-tmp

and this code :

tmp=proc.time()
x<- foreach(i=1:1000) %dopar% sqrt(i)
x
proc.time()-tmp

the R console prints the following for the parallel version:

       user      system     elapsed 
      0.464       0.776       0.705  

and for the normal loop :

       user      system     elapsed 
      0.001       0.000       0.001 

So the normal loop runs faster. Is that expected?

Thanks for your help.


There are 2 answers

4
Max Candocia On

Parallel processing won't speed up simple operations like sqrt(x). Ideally you use it for more complex operations, or you do something like,

x <- foreach(i = 0:9, .combine = 'c') %dopar% sqrt(seq(i*10000000, (i+1)*10000000 - 1))
x

It takes more time to hand work to another process than it does to perform such small tasks. If you look at the processors used in your system monitor/task manager while your original loop runs, you'll see that only one processor is used.

Edit: It seems that you have no parallel backend set up for your foreach loop, so it will default to sequential mode anyway. An easy way to set up the parallel backend is

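library(doParallel)  # also attaches foreach and parallel
ncores = detectCores()
# leave a couple of cores free, but always start at least one worker
clust = makeCluster(max(1, ncores - 2, na.rm = TRUE))
registerDoParallel(clust)
#do stuff here
#...
stopCluster(clust)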

Depending on your system, you may need to do more outside of R in order to get the backend set up.
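To confirm whether a backend is actually registered before timing anything, foreach provides a few query functions. The following is only a minimal sketch, assuming the foreach and doParallel packages are installed; the worker count of 2 is an arbitrary choice for illustration.

```r
library(foreach)
library(doParallel)

# In a fresh session nothing is registered, so %dopar% would run sequentially.
registered_before <- getDoParRegistered()

cl <- makeCluster(2)        # 2 workers: arbitrary value for this sketch
registerDoParallel(cl)

registered_after <- getDoParRegistered()  # TRUE once a backend is registered
backend_name     <- getDoParName()        # name reported by the backend
workers          <- getDoParWorkers()     # number of workers %dopar% will use

stopCluster(cl)
registerDoSEQ()             # fall back to the sequential backend explicitly
```

If getDoParWorkers() reports 1, a %dopar% loop is effectively running as an ordinary sequential loop.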

0
Andrie On

Here is some test code you can use to set up a parallel experiment on Windows:

library(foreach)
library(doParallel)

cl <- makePSOCKcluster(2)
registerDoParallel(cl)

system.time({
  x <- foreach(i=1:1000) %do% Sys.sleep(0.001)
})

system.time({
  x <- foreach(i=1:1000) %dopar% Sys.sleep(0.001)
})

stopCluster(cl)

You should find that the parallel implementation runs in roughly half the time of the serial one:

> system.time({
+   x <- foreach(i=1:1000) %do% Sys.sleep(0.001)
+ })
   user  system elapsed 
   0.08    0.00   12.55 
> 
> system.time({
+   x <- foreach(i=1:1000) %dopar% Sys.sleep(0.001)
+ })
   user  system elapsed 
   0.23    0.00    6.09 

Note that parallel computing is not a silver bullet. There is a fixed startup cost as well as a communication cost. See Amdahl's law.

In general, it is only worth doing parallel computing if your task is taking a long time to run.
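As a rough illustration of Amdahl's law, the theoretical speedup with n workers is 1 / ((1 - p) + p / n), where p is the fraction of the run time that can be parallelised. The sketch below uses base R only; the function name amdahl_speedup and the value p = 0.95 are hypothetical choices for the example, and the formula ignores the startup and communication costs mentioned above, so real speedups will be lower.

```r
# Theoretical speedup under Amdahl's law.
# p: fraction of the run time that can be parallelised (between 0 and 1)
# n: number of workers
amdahl_speedup <- function(p, n) {
  stopifnot(p >= 0, p <= 1, n >= 1)
  1 / ((1 - p) + p / n)
}

# Hypothetical workload where 95% of the time is parallelisable.
speedup_2 <- amdahl_speedup(p = 0.95, n = 2)
speedup_8 <- amdahl_speedup(p = 0.95, n = 8)

# The serial 5% caps the speedup at 1 / (1 - p), however many workers are added.
upper_bound <- 1 / (1 - 0.95)
```

With these assumed numbers, going from 2 to 8 workers helps, but the speedup can never exceed the bound set by the serial part.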