I've read Airflow's FAQ about "What's the deal with start_date?", but it still isn't clear to me why using a dynamic start_date is recommended against.
To my understanding, a DAG's execution_date is determined by the minimum start_date among all of the DAG's tasks, and subsequent DAG Runs are run at the latest execution_date + schedule_interval.
If I set my DAG's default_args start_date to, say, yesterday at 20:00:00, with a schedule_interval of 1 day, how would that break or confuse the scheduler, if at all? If I understand correctly, the scheduler would trigger the DAG with an execution_date of yesterday at 20:00:00, and the next DAG Run would be scheduled for today at 20:00:00.
Is there some concept that I'm missing?
The first run would be at start_date + schedule_interval. Airflow doesn't run the DAG on start_date; it always runs it on start_date + schedule_interval.
As mentioned in the documentation, if you give a dynamic start_date, e.g. datetime.now(), and some schedule_interval (say 1 hour), it will never execute that run, because now() moves along with time and so the moment datetime.now() + 1 hour is never reached.
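To make the moving-target problem concrete, here is a minimal sketch in plain Python. It is not Airflow's scheduler code: first_run_is_due and simulate are hypothetical helpers, and the model assumes only what is described above (a run becomes due once start_date + schedule_interval has passed) plus the assumption that start_date is re-evaluated on every scheduler pass, as it would be when the DAG file is parsed again.

```python
from datetime import datetime, timedelta


def first_run_is_due(start_date, schedule_interval, now):
    # Simplified rule from the answer: the first run fires only after
    # start_date + schedule_interval has been reached.
    return now >= start_date + schedule_interval


def simulate(get_start_date, schedule_interval, clock_start, ticks, tick):
    # Hypothetical scheduler loop: at every tick the DAG definition is
    # re-evaluated, so start_date is recomputed from get_start_date(now).
    for i in range(ticks):
        now = clock_start + i * tick
        start_date = get_start_date(now)
        if first_run_is_due(start_date, schedule_interval, now):
            return now  # the moment the first run would be triggered
    return None  # never triggered within the simulated window


clock_start = datetime(2024, 1, 1, 12, 0)
interval = timedelta(hours=1)
one_minute = timedelta(minutes=1)

# Static start_date: fixed at 12:00, so the first run becomes due at 13:00.
static_result = simulate(
    lambda now: datetime(2024, 1, 1, 12, 0), interval, clock_start, 180, one_minute
)

# Dynamic start_date: equivalent to datetime.now(), so the deadline keeps moving.
dynamic_result = simulate(lambda now: now, interval, clock_start, 180, one_minute)

print(static_result)   # 2024-01-01 13:00:00
print(dynamic_result)  # None
```

Under these assumptions, the static start_date triggers exactly one interval after it, while the dynamic one never does, however long the simulated window is. A fixed start_date of "yesterday at 20:00:00", as in the question, behaves like the static case: its first run has an execution_date of yesterday at 20:00:00 and is triggered once today at 20:00:00 has passed.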