I am experiencing an issue where the django_celery_results_chordcounter
table is filling up fast, causing the server to run out of disk space. It grew from a few MB to over 99 GB.
I tried to resolve this by setting CELERY_RESULT_EXPIRES=60,
hoping that the Celery backend cleanup task would clean up the table every minute, but that did not happen.
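In settings.py it looks roughly like this:

CELERY_RESULT_EXPIRES = 60  # seconds; maps to Celery's result_expires (assumes the usual CELERY settings namespace)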
I ran the task, and by the time the table had grown to about 7 GB, I truncated it from the psql shell. This is definitely not a solution, but I had to do it so the task could complete without my adding server resources.
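The manual cleanup was just a plain truncate (which also throws away the counters for any chord still in flight, so those callbacks will never fire):

TRUNCATE TABLE django_celery_results_chordcounter;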
Here are the Celery tasks leading to this problem. The number of items can range from hundreds of thousands to millions.
Server specs: 16 vCPUs, 64 GiB memory
from celery import chord, group

# celery_app is the project's Celery() instance; Item is a Django model.

@celery_app.task(ignore_result=True)
def get_for_one(item_id):
    # an IO-bound task
    pass


@celery_app.task(ignore_result=True)
def get_for_many(parent_id):
    # One sub-task per unowned item under this parent; this header can
    # contain hundreds of thousands to millions of signatures.
    header = group(
        get_for_one.s(item.id)
        for item in Item.objects.filter(
            owner__isnull=True, parent_id=parent_id
        ).iterator()
    )
    # The chord fires the callback once every sub-task has finished;
    # django-celery-results tracks that progress in the
    # django_celery_results_chordcounter table.
    chord(header)(get_for_many_callback.si(parent_id))
celery==5.2.7
Django==4.1.1
django-celery-beat==2.4.0
django-celery-results==2.4.0
Celery runs the built-in celery.backend_cleanup periodic task daily at 4 a.m. by default, so it won't necessarily delete results right after they expire; expired rows stick around until the next scheduled cleanup.
If you want to run the cleanup task more often, you can schedule your own interval in CELERY_BEAT_SCHEDULE:
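For example, to run it hourly (the crontab value here is illustrative; pick whatever cadence fits your workload):

from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    # Same name as the built-in entry, so this should override the
    # default daily 4 a.m. schedule rather than add a second cleanup.
    "celery.backend_cleanup": {
        "task": "celery.backend_cleanup",
        # Illustrative cadence: the top of every hour.
        "schedule": crontab(minute=0),
    },
}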