Connection error when using a foreach loop in R


I'm using a foreach loop to try to speed up some data processing. I'd upload the full code, but it's about 2k lines long, so that doesn't seem worthwhile. Basically, I have a bunch of matrices (15 columns wide and 300 to 1500 rows long) that I need to pass through Mplus using mclust. An outer for loop wraps the foreach loop, and the foreach loop contains the mclust model fitting. Something like this:

library(doParallel)  # also attaches foreach and parallel
registerDoParallel(4)

for (i in 1:10) {
  # The data are split into 10 smaller chunks, one .rda file per chunk.
  # get(load(...)) assumes each file holds exactly one object.
  if (i == 1) {data <- get(load("file.rda"))}
  if (i == 2) ...
  out <- foreach(sim = seq_along(data), .packages = c("mclust", "MplusAutomation")) %dopar% {
    # Fit various models in Mplus and save the important output to `results`.
    # The last expression is what foreach collects, so returning the vector
    # here gives the list of results I need rather than a single value.
    results[1:130]
  }
  if (i == 1) {save(out, file = "out.file.rda")}
  if (i == 2) ...
}

Anyway, I know the code works on smaller data batches: for instance, if I tell it to run only on the first ten items in each of the datasets, it runs clean through without issue. However, when I ramp this up to the full dataset, I get errors like this:

Error in { : task 175 failed - "cannot open the connection"

The error happens at different points in the script, not always at the same task or time. I've tried changing the number of cores (4 to 6) and the amount of data loaded at any one time (from all 6.6 GB at once down to 1/10th of that), and I've increased the memory limit (memory.limit(size=56000)), but none of these changes has allowed the code to run without error. In fact, it has never managed to complete even one iteration of the outer i loop.
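Since the failing task number changes from run to run, it may help to record which tasks fail and with what message rather than letting the first failure abort the whole loop. Below is a minimal sketch of that idea in base R, not the actual code from the script: `run_task_safely` and `toy_task` are hypothetical names, `lapply` stands in for the `foreach` loop, and the toy task deliberately reads a file that does not exist in order to trigger a connection error.

```r
# Hypothetical helper: run one task and record an error instead of raising it.
run_task_safely <- function(sim, task_fun) {
  tryCatch(
    list(sim = sim, ok = TRUE, value = task_fun(sim), message = NA_character_),
    error = function(e) {
      # Keep the task index and the error text so failures can be inspected later.
      list(sim = sim, ok = FALSE, value = NULL, message = conditionMessage(e))
    }
  )
}

# Toy stand-in for the model fitting (assumption): every third task tries to
# read a file that does not exist, which raises a connection error.
toy_task <- function(sim) {
  if (sim %% 3 == 0) {
    missing_path <- file.path(tempdir(), "no_such_dir", "input.dat")
    suppressWarnings(readLines(missing_path))
  }
  sim^2
}

# lapply is used here only to keep the sketch dependency-free.
results <- lapply(1:6, run_task_safely, task_fun = toy_task)

# Collect the failed tasks and their messages.
failed <- Filter(function(r) !r$ok, results)
failed_ids <- vapply(failed, function(r) r$sim, integer(1))
failed_msgs <- vapply(failed, function(r) r$message, character(1))

print(failed_ids)
print(failed_msgs)
```

The same wrapper could be called as the body of the `foreach(...) %dopar% { }` block, so that `out` contains one record per task. `foreach` also has an `.errorhandling` argument (`"stop"`, `"remove"`, or `"pass"`) that changes what happens when a task fails, although it does not by itself record the task index.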

Any suggestions?
