Optimizing nested foreach dopar in R

1.7k views Asked by At

I'd like input on how my code below is structured. Would like to know if it needs to be organized in a different way to execute faster. Specifically, whether I need to be using foreach and dopar differently in the nested loops. Currently, the inner loop is the bulk of the work (ddply with between 1-8 breakdown variables, each of which has 10-200 levels), and that's what I have running in parallel. I left out the code details for simplicity.

Any ideas? My code, as organized below, does work, but it takes a few hours on a 6-core, 41gb machine. The dataset isn't that large (< 20k records).

for(m in 1:length(Predictors)){  # has up to three elements in the vector

  # construct the dataframe based on the specified predictor
  # subset the original dataframe based on the breakdown variables, outcome, predictor and covariates

  for(l in 1:nrow(pairwisematrixReduced)){  # this has 1-6 rows;subset based on correct comparison groups

    # some code here

    cl <- makeCluster(detectCores())  
    registerDoParallel(cl) 

    for (i in 1:nrow(subsetting_table)){  # this table has about 50 rows

      # this uses the columns specified by k in the glm; the prior columns will be used as breakdown variables
      # up to 10 covariates
      result[[length(result) + 1]] <- foreach(k = 11:17, .packages=c('plyr','reshape2', 'fastmatch')) %dopar% {   

        ddply( 
          df,
          b,   # vector of breakdown variables
          function(x) { 

           # run a GLM and manipulate the output

          ,.parallel = TRUE) # close ddply
      } # close k loop -- set of covariates
    } # close i loop -- subsetting table
  } #close l -- group combinations
} # close m loop - this is the pairwise predictor matrix 

stopCluster(cl)
result <- unlist(result, recursive = FALSE)
tmp2<-do.call(rbind.fill, result)
1

There are 1 answers

0
manotheshark On

Copied out of vignette("nested")

3 Using %:% with %dopar%

When parallelizing nested for loops, there is always a question of which loop to parallelize. The standard advice is...

You also are using foreach %dopar% along with ddply and .parallel=TRUE. With a six core processor (and presumably hyper threading) means the foreach block would start 12 environments and then the ddply would start 12 environments within each of those for 144 simultaneous environments. The foreach should be changed to %do% to be consistent with your questions text of running the inner loop in parallel. Or to make it cleaner, change both to foreach and use %dopar% for one loop and %:% for the other.