I have several classes defined in Python which represent jobs. In my orchestrator I define the needed functions for Airflow as follows:
from jobs.package.job import ToBeExecuted
def run_job(**context):
ti = context['ti']
date = context['ds']
job = ToBeExecuted()
input = ti.xcom_pull(task_ids='previous_job')
output = output.csv
job.run(input, output, date)
return output
As mentioned in the Airflow docs (https://pythonhosted.org/airflow/concepts.html?highlight=zip#packaged-dags), you cannot use external packages without packaging them.
But I just don't understand the described solution. You package everything in the zip folder, but then what? How do you launch it? How do you backfill it?