I am trying to create a piece of parallel code to speed up the processing of a very large array (a couple of hundred million rows). In order to parallelise this, I chopped my data into 8 pieces (my number of cores) and tried sending each worker one piece. Looking at my RAM usage, however, it seems every piece is sent to every worker, effectively multiplying my RAM usage by 8. A minimal working example:
A = 1:16;
data = cell(1,8);
for ii = 1:8
    data{ii} = A(2*ii-1:2*ii);
end
Now, when I send this data to workers using parfor, it seems to send the full cell instead of just the desired piece:
output = cell(1,8);
parfor ii = 1:8
    output{ii} = data{ii};
end
I actually use some function within the parfor loop, but this illustrates the case. Does MATLAB actually send the full cell data to each worker, and if so, how can I make it send only the desired piece?
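(For reference, the amount of data actually shipped to the workers can be measured by wrapping the parfor in ticBytes/tocBytes from the Parallel Computing Toolbox; this is only a measurement sketch around the example above.)

pool = gcp();                 % current parallel pool
output = cell(1,8);
ticBytes(pool);               % start counting bytes transferred to/from the workers
parfor ii = 1:8
    output{ii} = data{ii};
end
tocBytes(pool)                % report bytes sent to and received from each worker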
In my personal experience, I have found that using parfeval is better than parfor regarding memory usage. In addition, your problem seems to break up naturally into smaller pieces, so you can use parfeval to submit many smaller jobs to the MATLAB workers.

Let's say that you have workerCnt MATLAB workers to which you are going to hand jobCnt jobs. Let data be a cell array of size jobCnt x 1, where each element is the data input for the function getOutput, which does the analysis on the data. The results are then stored in a cell array output of size jobCnt x 1.

In the following code, jobs are assigned in the first for loop and the results are retrieved in the second while loop. The boolean variable doneJobs indicates which jobs are done.

Also, you can take this approach one step further if you want to save even more memory: after fetching the results of a completed job, you can delete the corresponding member of future. The reason is that this object stores all the input and output data of the getOutput function, which will probably be huge. But you need to be careful, as deleting members of future results in an index shift.

The following is the code I wrote for this purpose.
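(The listing below is a minimal sketch reconstructed from the description above; getOutput is assumed to take a single element of data and return one result, and the future-deletion optimisation is only indicated in a comment rather than implemented.)

pool      = gcp();                       % current parallel pool
workerCnt = pool.NumWorkers;             % parfeval queues the jobs across these workers
jobCnt    = numel(data);
output    = cell(jobCnt, 1);
doneJobs  = false(jobCnt, 1);            % tracks which jobs have finished

% Assign jobs: each parfeval call ships only data{jobIdx} to a worker.
for jobIdx = 1:jobCnt
    future(jobIdx) = parfeval(pool, @getOutput, 1, data{jobIdx});
end

% Retrieve results as the jobs complete.
while ~all(doneJobs)
    [finishedIdx, result] = fetchNext(future);   % blocks until some job is done
    output{finishedIdx}   = result;
    doneJobs(finishedIdx) = true;
    % To save more memory, future(finishedIdx) could be deleted here, but
    % remember that removing it shifts the indices of the remaining futures.
end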