I am having trouble with multiprocessing's Pool. In the past, for code set up very much like the following, pool.map() iterated through the whole list (though I could be wrong, and perhaps something else handled the remaining items?), but that does not seem to be the case here: the code runs, but only for the first 16 items, 16 being the number of cores on my machine.

What is the expected behavior for code that is set up as follows?
def export_task(item):
    subject, outputPathChunk = item
    subject.export_hdf5(outputPathChunk)
And then
import multiprocessing
pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
pool.map(export_task, subs)
pool.close()
Where subs is a 600-item list of tuples, each containing a vaex table (a pandas alternative for larger data) and an output path.
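To make the structure concrete, here is an illustrative sketch of how subs could look; the real tables come from elsewhere, and the names and paths below are made up:

import vaex

# Toy table standing in for the real data; subs pairs each table with its output path.
df = vaex.from_arrays(x=[1, 2, 3])
subs = [(df, f"/tmp/chunk_{i:03d}.hdf5") for i in range(600)]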
There is a vaex-related warning for the first 16 executions of export_task, and I am wondering if that is choking pool.map. That would be a simple issue to work around, but a simple sample_table.export_hdf5(sample_path) sanity check outside the pool does not produce the same warning.
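To see exactly what the workers are emitting, I am considering capturing the warnings inside the worker. A sketch, assuming the same export_task and subs as above:

import warnings

def export_task(item):
    subject, outputPathChunk = item
    # Record any warnings raised during the export so they can be printed per item.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        subject.export_hdf5(outputPathChunk)
    for w in caught:
        print(outputPathChunk, w.category.__name__, w.message)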
Is the pool stalling because the function does not return anything and only does file I/O, or is this caused by the vaex warning that is only produced within the pool?
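One diagnostic I am considering, to tell a stall apart from silent completion, is to have the worker return its path and watch completions arrive one by one. A minimal sketch, assuming subs is built as in the question:

import multiprocessing

def export_task(item):
    subject, outputPathChunk = item
    subject.export_hdf5(outputPathChunk)
    # Return the path so each completion is visible from the parent process.
    return outputPathChunk

if __name__ == "__main__":
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        # imap_unordered yields results as workers finish, so a stall after 16
        # items should be obvious from where the printing stops.
        for done in pool.imap_unordered(export_task, subs):
            print("finished:", done)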