We receive files daily into a file share, and the number of files saved in that folder keeps growing. As a result, the ADF pipeline takes a long time to pick up the latest file. How can I reduce the time the pipeline spends going through all the files in the folder? Thank you!

This is an overview of my pipelines:

[Screenshots of the pipeline overview]


There are 2 answers

Chen Hirsh:

On your initial Get Metadata activity (Get file List), limit the files to only those modified in the last day:

[Screenshot: Get Metadata activity settings, filtering by last modified time]

where start time is 24 hours ago:

@subtractFromTime(utcnow(),1440,'Minute')

and end time is now:

utcnow()

Since you want to find the latest file, returning only the files modified in the last 24 hours will reduce the number of iterations in the ForEach activity and improve run times.
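For reference, here is a minimal sketch of how that filter-by-last-modified setting might look in the exported pipeline JSON. The dataset name FileShareDataset and the AzureFileStorageReadSettings store type are placeholders; use whichever read settings match your file share connector.

{
    "name": "Get file List",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": { "referenceName": "FileShareDataset", "type": "DatasetReference" },
        "fieldList": [ "childItems" ],
        "storeSettings": {
            "type": "AzureFileStorageReadSettings",
            "modifiedDatetimeStart": "@subtractFromTime(utcnow(),1440,'Minute')",
            "modifiedDatetimeEnd": "@utcnow()"
        }
    }
}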

Rakesh Govindula:

Adding to @Chen Hirsh's answer: since you upload one file per day, the Get Metadata activity will return only one file (the latest uploaded file) in the childItems array.

So there is no need for a ForEach loop to determine the latest file; you can use the Get Metadata activity output directly for the latest file name.

Use the same expression for the start time as in the answer above. For the end time, use @utcnow() (note the leading @, so the value is evaluated as an expression rather than a literal string). After the Get Metadata activity, add a Copy activity as shown below.

[Screenshot: Copy activity following the Get Metadata activity]

Use the expression below for the file name: add a dataset parameter for the file name in the source dataset and assign this expression to that parameter.

@activity('Get Metadata1').output.childItems[0].name
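For illustration, a rough sketch of what the Copy activity could look like in the pipeline JSON, assuming a source dataset named SourceFileDataset with a string parameter fileName and delimited-text source/sink formats (all of these names and types are placeholders, adjust them to your datasets):

{
    "name": "Copy latest file",
    "type": "Copy",
    "dependsOn": [
        { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
    ],
    "inputs": [
        {
            "referenceName": "SourceFileDataset",
            "type": "DatasetReference",
            "parameters": {
                "fileName": {
                    "value": "@activity('Get Metadata1').output.childItems[0].name",
                    "type": "Expression"
                }
            }
        }
    ],
    "outputs": [
        { "referenceName": "SinkDataset", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": { "type": "DelimitedTextSink" }
    }
}

Inside the source dataset itself, the file name field would then reference the parameter with @dataset().fileName.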

NOTE: This works only when the time between your file uploads is 24 hours. If the interval changes, adjust the start time expression accordingly.
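If the upload interval does vary, one option (a suggestion beyond the answers above) is to make the lookback window a pipeline parameter, for example an integer parameter named lookbackMinutes, and reference it in the start time expression:

@subtractFromTime(utcnow(), pipeline().parameters.lookbackMinutes, 'Minute')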