Automatically Triggering Airflow DAG in Cloud Composer on GCP without Cloud Functions


Hello Stack Overflow community,

I'm currently working on a project at work where I need to automatically trigger an Airflow DAG in Cloud Composer on Google Cloud Platform (GCP) whenever a .csv file is uploaded to a Google Cloud Storage (GCS) bucket. However, there's a restriction in my organisation that prevents the use of Cloud Functions for this purpose.

I've successfully implemented a solution using Cloud Functions, but due to organisational constraints, I need to explore alternative methods that are both efficient and cost-effective. I would appreciate any guidance or suggestions on achieving this without relying on Cloud Functions.

If you've encountered a similar scenario or have ideas on how to set up this file upload trigger without using Cloud Functions, your insights would be incredibly valuable.

Thank you in advance for your help!

As mentioned, I know how to do this using Cloud Functions, but this is prohibited in my organisation, so I need to find alternative methods.

There is 1 answer

Yusuf Quazi

One option I can think of is polling with a time-based DAG.

Set up a DAG in Airflow that runs on a regular schedule (e.g., every 5 minutes). Within the DAG, use a GoogleCloudStorageListOperator (called GCSListObjectsOperator in the Airflow 2 Google provider) to list the files in your GCS bucket. Compare the current file list with the list stored from the previous run. If a new .csv file appears, trigger the necessary downstream Airflow tasks or a different DAG.
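To make the comparison step concrete, here is a minimal sketch in plain Python of the "diff the current listing against the previous one" logic. The function name `find_new_csv_files` and the sample object names are hypothetical, made up for illustration. In a real DAG the current listing would come from the list operator's output, and the previous listing would need to be persisted somewhere between runs (an Airflow Variable is one possibility, but that is an assumption, not something tested here).

```python
from typing import Iterable, List, Set


def find_new_csv_files(current: Iterable[str], seen: Set[str]) -> List[str]:
    """Return .csv object names that are in `current` but not in `seen`.

    The result is sorted so the output is deterministic across runs.
    The extension check is case-insensitive (an illustrative choice).
    """
    return sorted(
        name
        for name in set(current)
        if name.lower().endswith(".csv") and name not in seen
    )


# Hypothetical listings from two consecutive polling runs.
previous_listing = {"data/a.csv", "data/readme.txt"}
current_listing = ["data/a.csv", "data/b.csv", "data/readme.txt", "data/c.CSV"]

new_files = find_new_csv_files(current_listing, previous_listing)
print(new_files)  # ['data/b.csv', 'data/c.CSV']
```

If `new_files` is non-empty, the DAG could branch into the processing tasks or trigger another DAG, then store the current listing as the new "seen" set. Note that this approach only detects new object names; a file overwritten under the same name would go unnoticed unless you also track something like its update time.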

Hope this helps :)

Ref: https://airflow.apache.org/docs/apache-airflow/1.10.12/_api/airflow/contrib/operators/gcs_list_operator/index.html