I am using this code to scrape an API:

from concurrent import futures

submissions = get_submissions(1)
with futures.ProcessPoolExecutor(max_workers=4) as executor:
    # or: with futures.ThreadPoolExecutor(max_workers=4) as executor:
    for s in executor.map(map_func, submissions):
        collection_front.update({"time_recorded": time_recorded}, {'$push': {"thread_list": s}}, upsert=True)
It works great (and fast) with threads, but when I try to use processes I get a full queue and this error:
File "/usr/local/lib/python3.4/dist-packages/praw/objects.py", line 82, in __getattr__
if not self.has_fetched:
RuntimeError: maximum recursion depth exceeded
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
self.run()
File "/usr/lib/python3.4/threading.py", line 868, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.4/concurrent/futures/process.py", line 251, in _queue_management_worker
shutdown_worker()
File "/usr/lib/python3.4/concurrent/futures/process.py", line 209, in shutdown_worker
call_queue.put_nowait(None)
File "/usr/lib/python3.4/multiprocessing/queues.py", line 131, in put_nowait
return self.put(obj, False)
File "/usr/lib/python3.4/multiprocessing/queues.py", line 82, in put
raise Full
queue.Full
Traceback (most recent call last):
File "reddit_proceses.py", line 64, in <module>
for s in executor.map(map_func, submissions):
File "/usr/lib/python3.4/concurrent/futures/_base.py", line 549, in result_iterator
yield future.result()
File "/usr/lib/python3.4/concurrent/futures/_base.py", line 402, in result
return self.__get_result()
File "/usr/lib/python3.4/concurrent/futures/_base.py", line 354, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Note that the processes originally worked great and very fast for small data retrievals, but now they are not working at all. Is this a bug, or what is going on that makes the PRAW object cause a recursion error with processes but not with threads?
I had a similar problem when moving from threads to processes, except that I was using executor.submit. I think it might be the same problem you have, but I can't be sure because I don't know in what context your code is running.
In my case, what happened was this: I was running my code as a script, and I did not use the always-recommended

if __name__ == "__main__":

guard. It looks like when the executor starts a new process, Python loads the .py file and then runs the function passed to submit. Because it loads the file, any code at module level (i.e. not inside a function or the if statement above) gets run again, so each process would spawn new processes of its own, resulting in infinite recursion. It looks like this doesn't happen with threads. See the sketch below for the fixed structure.
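
As a minimal sketch of that structure (the names here, like main and the dummy map_func, are placeholders rather than anything from the original post), moving all module-level work under the guard looks something like this:

from concurrent import futures


def map_func(item):
    # Placeholder worker: stands in for whatever per-item work you map or submit.
    return item * 2


def main():
    items = range(10)  # placeholder input, e.g. your get_submissions(1)
    with futures.ProcessPoolExecutor(max_workers=4) as executor:
        for result in executor.map(map_func, items):
            print(result)


# Without this guard, a worker process that re-imports the module would
# execute the pool-creating code again and spawn more processes.
if __name__ == "__main__":
    main()

With this layout, importing the module in a child process only defines functions; the pool is created exactly once, in the original process.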