Update: there may be a solution (I'm not entirely sure yet); it is mentioned at the end.
I've been struggling to find out why an exception raised in Pool workers wasn't propagating, i.e. the program was continuing silently. I added quite a lot of exceptions to my code to identify where it could be getting caught, and found some weird behavior.
I have a class such as:

    class Pipeline:
        def __call__(self, **kwargs):
            raise ValueError()
            for module in self.modules:
                print("Pipeline: call module: %s" % str(module))
                # raise BaseException()
                raise ValueError()
                o = module(**kwargs)
                kwargs.update(o)
            return o, kwargs

whose instances are called in a Pool (in fact a pathos.ProcessingPool, not a multiprocessing.Pool; I'm not sure whether that changes anything in this context), i.e.

    from pathos.multiprocessing import ProcessingPool as Pool

    with Pool(processes=n_thread) as pool:
        pool.map(run_exp_args, [...])

    def run_exp_args([...]):
        [...]
        p = Pipeline(...)
        o, k = p(**kwargs)

As you would expect, this raises a ValueError in Pipeline.__call__ before entering the loop (which is confirmed by the line number in the trace). It stops the program and shows the relevant trace.
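For comparison, here is a minimal sketch of the behavior I would expect from a pool: an exception raised in a worker is re-raised in the parent when the results of map are collected. It uses multiprocessing.pool.ThreadPool from the standard library, which exposes the same map API as multiprocessing.Pool, only so that the snippet runs without any process-spawning guards. The names failing_task and run_baseline are hypothetical, and I have not verified that pathos behaves the same way.

```python
from multiprocessing.pool import ThreadPool


def failing_task(x):
    # Hypothetical worker: always fails, like the first raise in Pipeline.__call__.
    raise ValueError("task %s failed" % x)


def run_baseline():
    # ThreadPool has the same map() API as multiprocessing.Pool.
    with ThreadPool(processes=1) as pool:
        try:
            pool.map(failing_task, [1, 2, 3])
        except ValueError as exc:
            # The worker's exception is re-raised in the caller.
            return "propagated: %s" % exc
    return "silently ignored"


print(run_baseline())
```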
What is strange is that if I comment out this first ValueError (but keep the one in the loop), the exception is never propagated; it's just ignored.
Now, if I instead raise a BaseException by uncommenting the line right above, this exception is raised and I see the trace, but the whole program isn't stopped.
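One possible (unverified) reason why a BaseException can surface where a ValueError does not: a broad `except Exception` handler catches ValueError but lets BaseException through. The sketch below only illustrates that language rule; swallow_errors is a hypothetical stand-in, and whether any such handler exists in pathos or in my own code is an assumption.

```python
def swallow_errors(func):
    # Hypothetical stand-in for any code that catches Exception broadly.
    try:
        func()
    except Exception:
        return "swallowed"
    return "ok"


def raise_value_error():
    raise ValueError()


def raise_base_exception():
    raise BaseException()


# ValueError is a subclass of Exception, so the handler catches it.
print(swallow_errors(raise_value_error))

# BaseException is not a subclass of Exception, so it escapes the handler.
try:
    swallow_errors(raise_base_exception)
except BaseException:
    print("BaseException escaped the handler")
```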
I've been experimenting with n_thread=1 just to be sure my output is relevant.
The print before the raise makes a difference... (and removing it eventually solves the problem): if I remove the print in the loop, the ValueError is shown and stops the program.

    class Pipeline:
        def __init__(self, *args):
            self.modules = [*args]

        def __call__(self, **kwargs):
            for module in self.modules:
                print()
                raise ValueError()
                o = module(**kwargs)
                kwargs.update(o)
            return o, kwargs

runs without error, whereas

    class Pipeline:
        def __init__(self, *args):
            self.modules = [*args]

        def __call__(self, **kwargs):
            for module in self.modules:
                raise ValueError()
                o = module(**kwargs)
                kwargs.update(o)
            return o, kwargs

fails (which is what I want!).
The thing is, in practice, the real logic happens in the o = module(...) calls, which contain quite a lot of instructions, including
That's a really annoying problem, as I cannot trust my program in parallel mode, so I have to run it without the Pool (especially since there are important assert statements that are ignored).
Do you have any ideas?
(1) The idea of raising a BaseException comes from "Exception thrown in multiprocessing Pool not detected", but that thread does not answer the real question: why do the two ValueErrors behave differently?
Using https://gist.github.com/oseiskar/dbd38098038df8944e21b41c42668440 seems to fix my problems.
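For readers who cannot open the link, here is a minimal sketch of the general kind of workaround I mean: wrap the worker function so that any failure is printed in the worker and re-raised with its full traceback text. This is not a copy of the gist; ReportErrors and WorkerError are hypothetical names, and the approach as written is my own assumption about what such a wrapper could look like.

```python
import sys
import traceback


class WorkerError(Exception):
    """Hypothetical exception type carrying the worker's formatted traceback."""


class ReportErrors:
    """Wrap a callable so that failures are printed and re-raised with their traceback."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        try:
            return self.func(*args, **kwargs)
        except Exception:
            tb_text = traceback.format_exc()
            # Print in the worker, so the error is visible even if it gets lost later.
            sys.stderr.write(tb_text)
            # Re-raise with the traceback as a plain string, which pickles safely.
            raise WorkerError(tb_text)
```

The wrapper would be applied at the call site, e.g. pool.map(ReportErrors(run_exp_args), [...]), rather than as a decorator, so that the wrapped function can still be pickled by reference.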