I am debugging a Celery worker running in a Docker container.
- Celery: 5.2.7
- Broker: RabbitMQ
- Results backend: Redis
- Profiling: py-spy 0.3.14
- Docker: 24.0.7
I launch the Celery worker with:

```
celery -A app.client worker --pool=threads -l debug --concurrency=1
```
I understand that this launches one main thread and a single worker thread (the threads pool is backed by `concurrent.futures.ThreadPoolExecutor`).
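For reference, here is a minimal stand-in for `app.client` that reproduces the setup (the module layout, task body, and connection URLs are placeholders, not my real code):

```python
# app/client.py -- simplified stand-in for my real app (all names are placeholders)
from celery import Celery

app = Celery(
    "client",
    broker="amqp://guest:guest@rabbitmq:5672//",  # RabbitMQ broker
    backend="redis://redis:6379/0",               # Redis results backend
)

@app.task
def do_work(n: int) -> int:
    # Stand-in for the actual worker logic block visible in the flame graph.
    return sum(i * i for i in range(n))
```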
Here is the py-spy flame graph.
I am currently reading some books about concurrency, so I am trying to understand what is going on under the hood. There are a couple of points in the graph that I do not understand, and I also want to make sure my application has no obvious bottlenecks.
- Why is `poll (kombu/utils/eventio.py:83)` represented that much (exactly 50.00%, just like `run (threading.py:953)`)? From my understanding, since I don't use any kind of rate limit in Celery, polling should not be blocking and should return immediately.
- Why is the time spent on line 81 and the time spent on line 83 of `concurrent/futures/thread.py` represented separately? (I sketch my reading of both code paths right after this list.)
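To make the two questions concrete, this is my paraphrase of the two code paths involved (simplified from my reading of kombu and of CPython's `concurrent/futures/thread.py`; exact line numbers and details differ by version, so treat it as a sketch, not the real source):

```python
import queue
import select
import threading

# My reading of the main thread (kombu/utils/eventio.py): the event loop
# parks inside an epoll wait until the broker socket becomes readable.
# A sampling profiler attributes every sample taken during that wait to
# this single line, even though the thread is blocked in the kernel.
def broker_event_loop(poller, timeout=1.0):
    return poller.poll(timeout)  # blocks until an event or the timeout

# My paraphrase of _worker in concurrent/futures/thread.py: the worker
# thread alternates between two lines of the same file -- parked on the
# queue while idle, running the work item while busy -- which is my best
# guess for why lines 81 and 83 show up as two separate blocks.
def worker_loop(work_queue):
    while True:
        work_item = work_queue.get(block=True)  # idle: parked on this line
        if work_item is None:                   # None as a shutdown signal
            return
        work_item()                             # busy: samples land here

if __name__ == "__main__":
    # Two threads, mirroring the worker layout: one parked in epoll,
    # one parked on a queue.
    q = queue.Queue()
    threading.Thread(target=worker_loop, args=(q,), daemon=True).start()
    poller = select.epoll()  # Linux-only, like the container
    for _ in range(5):
        broker_event_loop(poller)  # nothing registered, so each call waits 1s
    q.put(None)
```

If that reading is correct, a mostly idle two-thread process would split the samples roughly 50/50 between those two parked lines, which would make the blocks sampling artefacts rather than bottlenecks, but I would like to confirm that.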
Mainly, I am trying to understand whether the two blocks `poll (kombu/utils/eventio.py:83)` and `concurrent/futures/thread.py:81` are actually blocking and I should worry about them, or whether (and why) they are an artefact of how py-spy samples this particular Celery setup.
When profiling with py-spy, I did not expect to see these blocks at all, apart from the block for the actual worker logic. I also looked at the source code that py-spy points to, but I still do not understand what leads py-spy to represent the execution this way.
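For completeness, this is roughly how I run py-spy against the worker process inside the container (`$WORKER_PID` is a placeholder; under Docker, py-spy needs the `SYS_PTRACE` capability, e.g. `docker run --cap-add=SYS_PTRACE`):

```
# One-off dump of each thread's stack; py-spy also marks threads as idle/active
py-spy dump --pid "$WORKER_PID"

# Record a flame graph like the one above (default sampling rate is 100 Hz)
py-spy record -o flame.svg --pid "$WORKER_PID" --duration 60
```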