I need to run a function several times, each time in a process that is completely isolated from all other memory. I would like to use multiprocessing for that (since I need to serialize a complex output coming from the function). I set the start_method to 'spawn' and use a pool with maxtasksperchild=1. I would expect to get a different process for each task, and therefore see a different PID:
import multiprocessing
import time
import os

def f(x):
    print("PID: %d" % os.getpid())
    time.sleep(x)
    complex_obj = 5  # more complex actually
    return complex_obj

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')
    pool = multiprocessing.Pool(4, maxtasksperchild=1)
    pool.map(f, [5]*30)
    pool.close()
However, the output I get is:
$ python untitled1.py
PID: 30010
PID: 30009
PID: 30012
PID: 30011
PID: 30010
PID: 30009
PID: 30012
PID: 30011
PID: 30018
PID: 30017
PID: 30019
PID: 30020
PID: 30018
PID: 30019
PID: 30017
PID: 30020
...
So the processes are not being respawned after every task. Is there an automatic way to get a new PID for each task (i.e. without starting a new pool for each batch of processes)?
You need to also specify chunksize=1 in the call to pool.map. Otherwise, multiple items from your iterable get bundled together into a single "task" from the perspective of the worker processes, so maxtasksperchild=1 only retires a worker after it has finished a whole chunk of items. With chunksize=1, every item is its own task, each worker is replaced after a single call, and the output no longer shows repeated PIDs.
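For reference, here is a minimal sketch of the corrected script. It is the same example as above with only the chunksize argument added (chunksize is a standard parameter of Pool.map); the complex return value is stood in for by a constant:

import multiprocessing
import time
import os

def f(x):
    print("PID: %d" % os.getpid())
    time.sleep(x)
    return 5  # stands in for the complex object

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')
    pool = multiprocessing.Pool(4, maxtasksperchild=1)
    # chunksize=1 makes every item its own task, so maxtasksperchild=1
    # retires each worker process after a single call to f
    pool.map(f, [5]*30, chunksize=1)
    pool.close()
    pool.join()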