It seems the following 2 snippets have the same behavior:
import os
import time
from multiprocessing import Pool

def sqr(a):
    time.sleep(1.2)
    print 'local {}'.format(os.getpid())
    if a == 20:
        raise Exception('fff')
    return a * a
pool = Pool(processes=4)
A:
try:
    r = [pool.apply_async(sqr, (x,)) for x in range(100)]
    pool.close()
    for item in r:
        item.get(timeout=999999)
except:
    pool.terminate()
    raise
finally:
    pool.join()

print 'main {}'.format(os.getpid())
B:
r = [pool.apply_async(sqr, (x,)) for x in range(100)]
pool.close()
for item in r:
    item.get(timeout=999999)
pool.join()
Initially I thought that if I don't call terminate(), all the other processes would keep running in the background even after the main process quits. But I checked htop, and it appears that all the subprocesses quit as soon as the exception is hit.
When you call pool.close(), you're telling the Pool that no more tasks will be sent to it. That allows it to shut down its worker processes as soon as the current queue of tasks is done being processed; no explicit terminate() call is required. This is mentioned in the docs for close(): "Prevents any more tasks from being submitted to the pool. Once all the tasks have been completed the worker processes will exit." Note that it doesn't matter whether a task completes successfully or throws an exception; either way, the task is done.
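For example, here's a minimal sketch of that behavior (my own illustration, not your code; the work() function and the active_children() check are just there for inspection, and I've kept the Python 2 print syntax from your snippets). After close() and join(), the workers have already exited on their own, with no terminate() anywhere:

import time
import multiprocessing
from multiprocessing import Pool

def work(x):
    time.sleep(0.1)
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)
    results = [pool.apply_async(work, (x,)) for x in range(20)]
    pool.close()    # no new tasks; workers exit once the queued tasks drain
    pool.join()     # wait for the worker processes to finish and exit
    # The pool's workers have been joined by now, so none of them should
    # appear as live children of this process.
    print 'live children after join: {}'.format(multiprocessing.active_children())
    print 'results: {}'.format([res.get() for res in results])

Since active_children() only reports child processes that are still alive, an empty list here means the pool's workers are genuinely gone.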
Additionally, all the worker processes in the Pool are started with daemon=True, which means they'll be terminated as soon as the parent process is ready to exit. In your case, you're calling get() on each item being processed, which will cause the exception thrown in the child to be re-raised in the parent. When that happens, the parent exits, which automatically terminates all the worker processes.
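Both halves of that are easy to check directly. Here's another small sketch (again mine, not from your question; fail() and the other names are made up for illustration) that prints the daemon flag on the pool's workers and shows the child's exception being re-raised by get() in the parent:

import os
import multiprocessing
from multiprocessing import Pool

def fail(x):
    # Raised inside the worker; the Pool pickles it and ships it back to the parent.
    raise ValueError('boom from {}'.format(x))

if __name__ == '__main__':
    pool = Pool(processes=2)
    # The workers are ordinary multiprocessing.Process children created with daemon=True.
    print 'workers daemonic: {}'.format(
        [p.daemon for p in multiprocessing.active_children()])
    res = pool.apply_async(fail, (20,))
    try:
        res.get(timeout=10)    # the ValueError from the child is re-raised here
    except ValueError as exc:
        print 'parent {} caught: {}'.format(os.getpid(), exc)
    pool.close()
    pool.join()

If the try/except were removed, the re-raised exception would propagate out of the main program, the parent would exit, and the daemonic workers would be torn down along with it, which is exactly what you saw in htop.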