is there any benefit in using multiprocessing versus running several python interpreters in parallel for long-running, embarrassingly parallel tasks?
At the moment, I'm just firing up several python interpreters that run the analysis over slices of input data, each of them dumping the results into a separate pickle file. It is trivial to slice the input data as well as to combine the results. I'm using python 3.4 on OS X and linux for that.
Is rewriting the code with the multiprocessing module worth the effort? It seems to me that not, but then I'm far away from being expert...
Well, one advantage of multiprocessing is that there are tools for interprocess communication and you can have shared variables with different sorts of restrictions. But if you don't need that your approach is perfectly viable. What you are doing is basically what most map reduce systems automate. I am sure there is some slight performance penalty in running an entire other interpreter but it's probably insignificant.