Speed up backfilling in Airflow (MWAA)


We have an Airflow instance running on AWS MWAA with only one DAG file. The DAG file contains ~20 tasks with some cross-dependencies and, most importantly, the tasks have depends_on_past=True set. The DAG works fine: it is scheduled every 5 minutes and each run completes in 1-2 s.

However, we now need to run a backfill for the past 3 years (~300k DAG runs). This is extremely slow: after more than 24 hours, only 800 runs have completed. The cause can't be our DAG file or its tasks, as they usually execute in <1 s. It must be related to some internal overhead of Airflow. Is Airflow simply very slow at picking up new tasks and scheduling them for execution?
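To put the numbers above in perspective, here is a rough back-of-the-envelope estimate. It assumes 365-day years and that the observed rate (800 runs per 24 hours) stays constant, neither of which is guaranteed; the variable names are just for illustration.

```python
from datetime import timedelta

# Figures taken from the question; the 365-day year is an assumption.
schedule_interval = timedelta(minutes=5)
backfill_window = timedelta(days=3 * 365)

# Number of DAG runs the backfill has to create (timedelta // timedelta -> int).
total_runs = backfill_window // schedule_interval

# Observed progress: 800 runs in roughly 24 hours.
observed_runs = 800
observed_elapsed = timedelta(hours=24)
seconds_per_run = observed_elapsed.total_seconds() / observed_runs

# Time left if the rate does not change.
remaining_runs = total_runs - observed_runs
eta = timedelta(seconds=remaining_runs * seconds_per_run)

print(f"total runs:        {total_runs}")
print(f"seconds per run:   {seconds_per_run:.0f}")
print(f"remaining at rate: {eta.days} days")
```

At this rate each DAG run costs about 108 s of wall-clock time even though the tasks themselves take about a second, and the full backfill would take more than a year.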

How can we speed up this process?

We're using the following environment class:

Class: mw1.medium
Scheduler count: 5
Maximum worker count: 25
Minimum worker count: 1

With these custom configuration options applied:

core.dag_file_processor_timeout 180
core.dagbag_import_timeout 120
core.max_active_runs_per_dag 1000
core.max_active_tasks_per_dag 1000
scheduler.max_dagruns_to_create_per_loop 100
scheduler.max_threads 7
scheduler.scheduler_heartbeat_sec 1
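For anyone who wants to reproduce this setup, the list above can be turned into a plain mapping. The sketch below parses the "section.key value" lines into a dict of strings and also shows the equivalent Airflow environment variable name (AIRFLOW__{SECTION}__{KEY}). The helper names parse_options and to_env_var are hypothetical, and the snippet does not call any AWS API.

```python
# The options exactly as listed in the question.
RAW_OPTIONS = """\
core.dag_file_processor_timeout 180
core.dagbag_import_timeout 120
core.max_active_runs_per_dag 1000
core.max_active_tasks_per_dag 1000
scheduler.max_dagruns_to_create_per_loop 100
scheduler.max_threads 7
scheduler.scheduler_heartbeat_sec 1
"""


def parse_options(raw: str) -> dict[str, str]:
    """Parse 'section.key value' lines into a {'section.key': 'value'} dict."""
    options: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split(None, 1)
        options[key] = value  # values are kept as strings
    return options


def to_env_var(option_key: str) -> str:
    """Map 'section.key' to Airflow's AIRFLOW__SECTION__KEY variable name."""
    section, key = option_key.split(".", 1)
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


options = parse_options(RAW_OPTIONS)
for name, value in options.items():
    print(f"{name} = {value}  ({to_env_var(name)})")
```

Keeping the values as strings matches how such options are usually passed around as key/value pairs; convert them to int only where a calculation needs it.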

Please note that MWAA's default config values differ from those of stock Airflow: https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-tuning.html
