I am using joblib to run my code in parallel for 10 different instances. The code is complex, I use dill for pickling, and each generated pickle file is about 80 MB. When I don't use multiprocessing, pickling one instance takes about one minute. But when I run the 10 instances in parallel (my laptop has 10 cores, so all 10 instances run at once, and the instances don't share any data), pickling each instance takes about 10 minutes, while I expected it to still take one minute as before. A summary of my code is as follows:
import dill as pickle
from joblib import Parallel, delayed, parallel_backend
from zipfile import ZipFile

def experiment(obj):
    zip_file = "temp.zip"
    with ZipFile(zip_file, 'w') as zf:
        # a ZipFile is not itself a writable stream; open an entry inside it
        with zf.open('data.pkl', 'w') as f:
            pickle.dump(obj, f, protocol=pickle.DEFAULT_PROTOCOL)

if __name__ == '__main__':
    # objs is a list of complex dictionaries and objects
    with parallel_backend('threading', n_jobs=-1):
        out = Parallel()(delayed(experiment)(obj) for obj in objs)
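For reference, the one-minute baseline mentioned above was measured roughly like this (a minimal sketch; the pickle_once helper, the baseline.zip filename, and the time.perf_counter wrapper are mine for illustration, not part of the real code):

    import time
    import dill as pickle
    from zipfile import ZipFile

    def pickle_once(obj):
        # time one pickle+zip round, same work as experiment() but sequential
        start = time.perf_counter()
        with ZipFile("baseline.zip", 'w') as zf:
            with zf.open('data.pkl', 'w') as f:
                pickle.dump(obj, f, protocol=pickle.DEFAULT_PROTOCOL)
        return time.perf_counter() - start

    # run sequentially, one obj at a time; each call takes ~1 minute
    durations = [pickle_once(obj) for obj in objs]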
Does anybody know the reason?