I am trying to get a little more comfortable with Python 2.7's multiprocessing module. So I have written a small script that takes file names and the desired number of processes as input, and then starts multiple processes that apply a function to each file name in my queue. It looks like this:
import multiprocessing, argparse, sys
from argparse import RawTextHelpFormatter

def parse_arguments():
    descr = '%r\n\nTest different functions of multiprocessing module\n%r' % ('_'*80, '_'*80)
    parser = argparse.ArgumentParser(description=descr.replace("'", ""), formatter_class=RawTextHelpFormatter)
    parser.add_argument('-f', '--files', help='list of filenames', required=True, nargs='+')
    parser.add_argument('-p', '--processes', help='number of processes for script', default=1, type=int)
    args = parser.parse_args()
    return args

def print_names(name):
    print name

###MAIN###
if __name__ == '__main__':
    args = parse_arguments()
    q = multiprocessing.Queue()
    procs = args.processes
    proc_num = 0
    for name in args.files:
        q.put(name)
    while q.qsize() != 0:
        for x in xrange(procs):
            proc_num += 1
            file_name = q.get()
            print 'Starting process %d' % proc_num
            p = multiprocessing.Process(target=print_names, args=(file_name,))
            p.start()
            p.join()
            print 'Process %d finished' % proc_num
The script works fine and starts a new process every time an old one finishes (I think that's how it works?), until all the objects in the queue are used up. However, the script does not exit after the queue is empty; it sits idle and I have to kill it with Ctrl+C. What is the problem here?
Thanks for your answers!
Seems as if you've mixed a few things up there. You spawn a process, have it do its work, and wait for it to exit before starting a new process in the next iteration. With this approach you are stuck in sequential processing; there is no actual multiprocessing being performed here. That structure is most likely also why the script hangs: q.get() blocks by default, so once the queue is empty, the next q.get() in your inner loop waits forever for an item that never arrives.
Maybe you want to take this as a starting point: