When writing a Python package that uses Joblib in some modules to parallelize coarse-grained tasks, what is the best practice for setting the number of worker processes (the n_jobs argument of the joblib.Parallel class)?
From the code I have seen, my impression is that n_jobs = 1 is often the default, and the user has the option to change it manually. Is this the best practice? What considerations are relevant when deciding how many processes to spawn by default?
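
For concreteness, the pattern I am referring to looks roughly like the sketch below. The names process_all and _square are made up for illustration and do not come from any particular package; only Parallel, delayed, and the n_jobs argument are Joblib's.

```python
from joblib import Parallel, delayed


def _square(x):
    # Hypothetical stand-in for a coarse-grained task.
    return x * x


def process_all(items, n_jobs=1):
    """Apply _square to every item, forwarding n_jobs to joblib.Parallel.

    n_jobs=1 (the default in this sketch) runs the tasks sequentially
    in the calling process; n_jobs=-1 asks Joblib to use all CPUs.
    """
    return Parallel(n_jobs=n_jobs)(delayed(_square)(x) for x in items)


if __name__ == "__main__":
    # Sequential by default; the caller opts in to parallelism explicitly.
    print(process_all(range(5)))
```

A caller who wants parallelism would then write something like process_all(data, n_jobs=4), so the library itself never spawns processes unless asked to.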