I have an Airflow (2.6.3) Celery environment with webserver (2 CPU), scheduler (8 CPU), and worker (16 CPU) EC2 instances. I have a use case where some DAGs need to run more than 50 tasks in parallel at the same time.
In airflow.cfg, parallelism was set to 32 and worker_concurrency to 16 with the CeleryExecutor, and the DAGs ran, but only 16 tasks at a time in parallel. To run more tasks simultaneously, I changed parallelism to 64 and worker_concurrency to 32. The DAGs are scheduled to run every 30 minutes, but now only alternate runs succeed (and 32 tasks run in parallel); the other runs get stuck in the 'up_for_retry' state without running any tasks and fail after some time.
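For reference, the relevant entries in airflow.cfg now look roughly like this (a sketch of only the values I touched; everything else should still be at its default):

```ini
[core]
executor = CeleryExecutor
# was 32 before the change
parallelism = 64

[celery]
# was 16 before the change
worker_concurrency = 32
```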
Is this a scheduler or a worker issue? What are the optimum settings for this scenario? Is it better to have multiple workers with 4 CPUs each or one worker with 16 CPUs?