Why is a %do% loop using multiple processors? Expected a sequential loop


I'm using foreach and reading up on it e.g.

My understanding is that you would use %dopar% for parallel processing and %do% for sequential.

As it happens, I was having issues with %dopar%, and while trying to debug I changed it to what I thought was a sequential loop using %do%. I happened to have the terminal open and noticed that all processors were busy while the loop ran.

Is this expected?

Reproducible example:

library(tidyverse)
library(caret)
library(foreach)

# expected to see parallel here because caret and xgb with train()
xgbFit <- train(Species ~ ., data = iris, method = "xgbTree", 
                trControl = trainControl(method = "cv", classProbs = TRUE))

iris_big <- do.call(rbind, replicate(1000, iris, simplify = FALSE))

nr <- nrow(iris_big)
n <- 1000 # chunk size: loop over the data in chunks of 1000 rows
pieces <- split(iris_big, rep(1:ceiling(nr/n), each=n, length.out=nr))
lenp <- length(pieces)

# did not expect to see parallel processing take place when running the block below
predictions <- foreach(i = seq_len(lenp)) %do% {

  # get prediction
  preds <- pieces[[i]] %>% 
    mutate(xgb_prediction = predict(xgbFit, newdata = .))

  return(preds)
}

bah <- do.call(rbind, predictions)



There is 1 answer

F. Privé (best answer)

My best guess would be that these are processes still running from previous runs.

Is it the same when using foreach::registerDoSEQ()?

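My second guess would be that predict itself runs in parallel: xgboost is multithreaded on its own, independently of foreach, so a sequential %do% loop can still keep several cores busy.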
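A minimal sketch of how one might check whether foreach itself is the source of the parallelism. It assumes only the foreach package; the variable names (workers, pids, sequential) are made up for illustration. If the loop reports a single worker and a single process ID while the cores are still busy, the extra load is probably coming from whatever is called inside the loop body (here, predict) rather than from foreach.

```r
library(foreach)

# Explicitly register the sequential backend, so that even %dopar%
# would run in the current R process.
registerDoSEQ()

# Number of workers foreach believes it has; 1 for the sequential backend.
workers <- getDoParWorkers()

# Collect the process ID seen by each iteration of a %do% loop.
pids <- foreach(i = 1:4, .combine = c) %do% Sys.getpid()

# TRUE if every iteration ran in the current R process.
sequential <- all(pids == Sys.getpid())

print(workers)
print(sequential)
```

Restarting the R session before running this also rules out leftover worker processes from earlier %dopar% runs.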