Is it more efficient to pass objects to parallel::parLapply and parallel::parLapplyLB as function arguments, or to export them with parallel::clusterExport? That is, either
parallel::parLapply(cl, 1:1000, function(y, x1, x2, x3, x4, x5) {
...
}, x1, x2, x3, x4, x5)
or
parallel::clusterExport(cl, c("x1", "x2", "x3", "x4", "x5"))
parallel::parLapply(cl, 1:1000, function(y) {
...
})
Non-parallel functions, for example, do not copy the arguments passed to them by default; they only create copies when the objects are modified. I was wondering whether the two parallel options above differ in how well they avoid unnecessary object copies.
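To make the comparison concrete, here is a minimal sketch of both variants on a small local cluster. The object `x1`, its size, the two-worker cluster, and the toy function body are assumptions chosen for illustration, not part of my real code. As far as I understand, in the first variant `x1` is serialized and sent along with the call, while in the second it is assigned once into each worker's global environment.

```r
library(parallel)

# Assumption: two local worker processes can be started on this machine.
cl <- makeCluster(2)

# Hypothetical large object standing in for x1 ... x5.
x1 <- runif(1e6)

# Variant 1: pass x1 as an extra argument to the worker function.
res_args <- parLapply(cl, 1:10, function(y, x1) y + length(x1), x1)

# Variant 2: export x1 to every worker first, then refer to it by name.
clusterExport(cl, "x1", envir = environment())
res_export <- parLapply(cl, 1:10, function(y) y + length(x1))

stopCluster(cl)

# Both variants should return the same values.
identical(res_args, res_export)
```

Wrapping each `parLapply` call in `system.time()` and watching the workers' memory use would be one way to compare the two on real data.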
With a large data set, I ran into memory management difficulties with both of your versions. I can suggest: