How to trigger a pipeline only once when multiple files are added, using an event-based trigger


I have an event-based trigger set on a Data Lake Gen2 folder. When 20 files are loaded at once, the trigger fires the pipeline once per file (20 times), and each run, after handling the file that triggered it, goes on to process the other files in the folder as well. I need the pipeline to be triggered only once for all 20 files together.

My pipeline has a ForEach activity that handles all the files in my path, so my expectation is that the pipeline is triggered only once regardless of the number of files loaded.


There are 2 answers

1
NiharikaMoola On

This is by design: once a storage event trigger is created, it fires every time a file matching the configured path or pattern arrives in the given folder.

As a workaround, you can archive each file once it has been processed, so the same file is not processed multiple times.

Or you can get the list of files with the Get Metadata activity and loop with an Until activity until all the expected files have arrived in the source, then process all the files at once.
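The Get Metadata plus Until pattern boils down to a polling loop: list the folder, compare the count with what you expect, wait, and repeat. Here is a minimal Python sketch of that logic only, not an ADF pipeline definition. `list_files` is a hypothetical callable standing in for the Get Metadata output, and `expected_count`, `poll_seconds` and `max_polls` are assumed parameters you would tune yourself.

```python
import time
from typing import Callable, List, Optional


def wait_for_all_files(
    list_files: Callable[[], List[str]],   # hypothetical: returns current file names
    expected_count: int,                   # assumed to be known up front (20 in the question)
    poll_seconds: float = 30.0,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[List[str]]:
    """Poll the folder until at least expected_count files are present.

    Returns the file list once it is complete, or None if max_polls
    listings were made without reaching expected_count.
    """
    for attempt in range(max_polls):
        files = list_files()
        if len(files) >= expected_count:
            return files
        # Do not sleep after the final unsuccessful attempt
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return None
```

The `max_polls` cap plays the role of the Until activity's timeout, so a missing file cannot keep the loop running forever.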

1
Prashant On

Yes, the Get Metadata approach suggested by @NiharikaMoola-MT gets you there up to a certain extent, but your pipeline will still be triggered every time a file lands in ADLS. Another way is to create a separate pipeline that keeps a count of the files and creates an event for the main pipeline once the count reaches a certain threshold.

There are a couple of other possible solutions that come at the problem from a slightly different angle:

  1. You can use an Azure Function activity to count the number of files and then trigger your pipeline from the function itself as the next step.

  2. You can consider a PowerShell script that counts the number of files and triggers the pipeline accordingly.

  3. A slightly different (and somewhat more expensive) approach: use Azure Logic Apps for the file counting and the pipeline trigger.
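To make option 1 a bit more concrete, here is a rough Python sketch of what such a function body could do. It assumes the `azure-storage-file-datalake` and `azure-mgmt-datafactory` packages and a credential object (for example from `azure-identity`). The helper names `count_files`, `list_folder`, `start_pipeline` and `trigger_if_complete` are made up for illustration, and so is the idea of a fixed `expected_count`. It does not deduplicate concurrent invocations, so treat it as a starting point rather than a finished solution.

```python
from typing import Callable, Iterable, Optional


def count_files(paths: Iterable) -> int:
    """Count entries that are files, skipping anything flagged as a directory."""
    return sum(1 for p in paths if not getattr(p, "is_directory", False))


def list_folder(account_url: str, file_system: str, folder: str, credential):
    """List the direct children of an ADLS Gen2 folder."""
    # Imported lazily so the counting logic can be exercised without the SDK
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(account_url=account_url, credential=credential)
    fs = service.get_file_system_client(file_system)
    return fs.get_paths(path=folder, recursive=False)


def start_pipeline(subscription_id: str, resource_group: str, factory_name: str,
                   pipeline_name: str, credential, parameters: Optional[dict] = None) -> str:
    """Start one Data Factory pipeline run and return its run id."""
    from azure.mgmt.datafactory import DataFactoryManagementClient

    client = DataFactoryManagementClient(credential, subscription_id)
    run = client.pipelines.create_run(
        resource_group, factory_name, pipeline_name, parameters=parameters or {}
    )
    return run.run_id


def trigger_if_complete(paths: Iterable, expected_count: int,
                        start: Callable[[], str]) -> Optional[str]:
    """Call start() only when enough files have arrived; otherwise do nothing."""
    if count_files(paths) >= expected_count:
        return start()
    return None
```

In use, you would pass the result of `list_folder(...)` as `paths` and something like `lambda: start_pipeline(...)` as `start`. If the function is itself invoked once per file, you would still need a marker file or similar guard so that only one invocation actually starts the run.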