How can I pass output from a filter activity directly to a copy activity in ADF?

I have 4,000 files, each averaging 30 KB in size, landing in a folder on our on-premises file system each day. I want to apply conditional logic (several and/or conditions) against details in their file names so that only files matching the conditions are moved into another folder.

I have tried chaining a Get Metadata activity (which gets all files in the source folder) to a Filter activity (which applies the conditional logic) to a ForEach activity with an embedded Copy activity. This works, but it takes hours to process the files. When running the pipeline in debug, the output window appears to list each copied file as a separate line item. I've increased the batch count setting on the ForEach to 50, but it hasn't improved things.

Is there a way to link the Filter activity directly to the Copy activity without using a ForEach activity, i.e. pass the collection from the Filter straight into the Copy activity's source?

Alternatively, some of our other pipelines just use a Copy activity pointing to a source folder, and we configure its fileFilter setting with a simple wildcard pattern (a combination of * and ?), which is extremely fast. In this particular scenario, however, my conditional logic is more complex: I need to compare attributes in each file's name with values to decide whether the file should be moved. The fileFilter setting allows dynamic content, so I could remove the Filter activity completely, point the Copy activity at the source folder, and put the conditional logic in the fileFilter's dynamic content area. But how would I get a reference to the file name to do the conditional checks?

There is 1 answer

Answered by Trent Tamura

Here is one solution:

  1. Write the array output of the Filter activity as text to a .json file in Blob Storage (or wherever). Here are the steps to make that work:

Copy Data Source:

[Screenshot: Copy Data Source]

Copy Data Sink:

[Screenshot: Copy Data Sink]

  2. Write the JSON (array output) to a text file that holds the names of the files you want to copy.

Copy Activity Source (to get it from JSON to .txt):

[Screenshot: 2nd Copy to convert JSON to .txt]

The sink will be a .txt file in your Blob Storage.

  3. Use that text file in your main Copy activity with the following setting:

[Screenshot: Copy using List of Files]

This should copy over all the files that you identified in your Filter Activity.
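To make the target format concrete, here is a minimal Python sketch of what the two conversion copies need to produce: a plain text file with one file name per line, each relative to the folder configured in the source dataset. It is only an illustration and not part of the pipeline. The function name `filter_output_to_file_list` and the sample file names are hypothetical, and the payload shape (a `value` or `Value` array of `{name, type}` items, as returned for Get Metadata child items) is an assumption you should check against your own debug output.

```python
import json
import tempfile
from pathlib import Path


def filter_output_to_file_list(filter_output: dict, list_path: Path) -> list:
    """Write one file name per line, as a 'list of files' text file expects.

    Assumption: the surviving items sit under "value" (or "Value") and each
    item looks like a Get Metadata child item: {"name": ..., "type": ...}.
    """
    items = filter_output.get("value") or filter_output.get("Value") or []
    # Keep files only; folder entries are skipped.
    names = [item["name"] for item in items if item.get("type", "File") == "File"]
    # One entry per line, relative to the folder set in the source dataset.
    text = "\n".join(names) + ("\n" if names else "")
    list_path.write_text(text, encoding="utf-8")
    return names


if __name__ == "__main__":
    # Hypothetical payload, pasted as JSON the way debug output would show it.
    sample = json.loads(
        '{"value": ['
        '{"name": "orders_A.csv", "type": "File"},'
        '{"name": "archive", "type": "Folder"},'
        '{"name": "orders_B.csv", "type": "File"}]}'
    )
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "files_to_copy.txt"
        filter_output_to_file_list(sample, target)
        print(target.read_text(encoding="utf-8"), end="")
```

If the text file your pipeline writes does not look like this (for example, it still contains JSON brackets or quotes), the list-of-files setting in the main Copy activity will not match any files.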

I realize this is a workaround, but it really is the only solution for what you are asking. Otherwise, there is no way to link a Filter activity straight to a Copy activity.