I am using a parameter-driven approach with a Copy Data activity in Data Factory.
Using a list that comes from an Azure SQL Database and contains almost 100 tables from different sources, I loop through this list and call a Copy Data activity that copies data from a source to a destination. The sources are a collection of Azure and on-prem SQL Server databases; the destination is an Azure Storage Account (the data is stored as CSV files in Blob Storage).
The reason for this is so I don't have to build 100+ different pipelines to ingest data into the blob storage.
Now I have a scenario that is creating some havoc.
A few tables have fields containing multiline data (values with line feeds in them), so the CSV files created in Blob Storage now also contain these line feeds. When I read the data back from Blob Storage, these line feeds start a new record when they actually shouldn't.
Is there a way, given my dynamic approach of ingesting data with a parameter-driven Copy Data activity, that I can prevent this?
It might also be noted that I use a query (built from my dynamic list) instead of a table to obtain the source data. Perhaps I can format the data in the SELECT query.
In order to do these changes dynamically, you can use a Data flow activity instead of the Copy activity.
In a Derived Column transformation, click +Add and then +Add column pattern, and give an expression like the following:
derive(each(match(true()), $$ = replace($$, char(13) + char(10), ''))) ~> derivedColumn1
This pattern matches every column and replaces the CRLF sequence, so if a column has multiline data, it is collapsed to a single line.
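If some values contain bare line feeds without a carriage return, a variant of the same column pattern (a sketch, not tested against your data) chains two replace calls to strip both CRLF and lone LF characters:

```
derive(each(match(true()), $$ = replace(replace($$, char(13) + char(10), ''), char(10), ''))) ~> derivedColumn1
```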
Since you are using a query built from your dynamic list, you can also use SQL itself to format this: the syntax to replace CRLF in SQL is essentially the same as in the data flow expression.
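As a sketch of what the generated SELECT could include (the table and column names here are hypothetical placeholders, not from your metadata list), REPLACE with CHAR(13)/CHAR(10) handles it on the source side:

```sql
-- Strip CRLF (and any stray bare LF) from a text column before the
-- Copy Data activity writes it out to CSV.
-- "dbo.MyTable" and "Notes" are illustration-only names.
SELECT
    Id,
    REPLACE(REPLACE(Notes, CHAR(13) + CHAR(10), ' '), CHAR(10), ' ') AS Notes
FROM dbo.MyTable;
```

If your metadata list already stores the column names per table, you can build this REPLACE wrapper into the dynamic query string for only the columns known to contain multiline text.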