I extract multiple object tables from Salesforce to Azure Data Lake using Azure Data Factory, with the Copy Data activity. The only problem is that it converts all column types to strings: whatever the type is in the source, it becomes a string type in the sink.
I want to keep the original types from the source.
How can I prevent this from happening?
The sink for now is a CSV file (delimited text) in an Azure Data Lake container. Should I change this to another format?
I am trying to understand more: what do you mean by preserving the data type in a CSV?
The CSV format itself is agnostic to data types: a CSV file does not retain explicit type metadata, so every value is read back as text unless the consumer infers or imposes a schema. If strict type adherence is crucial throughout your data pipeline, you might need to consider an alternative format such as Parquet, which stores a typed schema alongside the data; see the dataset sketch below.
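For instance, here is a minimal sketch of an ADF Parquet sink dataset pointed at your lake. The dataset name, linked service reference, and file path are placeholders, not taken from your pipeline:

```json
{
    "name": "SalesforceObjectParquet",
    "properties": {
        "type": "Parquet",
        "linkedServiceName": {
            "referenceName": "AzureDataLakeStorageLS",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "raw",
                "folderPath": "salesforce/account",
                "fileName": "account.parquet"
            },
            "compressionCodec": "snappy"
        },
        "schema": []
    }
}
```

Using a dataset like this as the Copy activity sink preserves the source column types (within Parquet's type system) instead of flattening everything to strings.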
Having said that, if your requirement is instead about storing values as "quoted" vs. unquoted:
In Azure Data Factory's Copy Data activity, when writing to a CSV file using the DelimitedText format, how values are written depends on the `quoteAllText` setting.

`quoteAllText = true`: if this setting is enabled, all values, regardless of data type, will be quoted in the CSV.
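For instance, with a few hypothetical Salesforce columns (illustrative only), the written file would look like this:

```
"Id","Name","AnnualRevenue"
"0015g00000ABC","Acme Corp","1500000"
"0025g00000DEF","Globex","2750000"
```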
`quoteAllText = false` (or not set): if this setting is disabled or not set explicitly, only string values that contain special characters (like commas, newlines, or other delimiter characters) will be quoted. Numeric values and strings without special characters will not be quoted.
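For example, with the same hypothetical columns and a mix of numeric and string values; note that only the value containing a comma is quoted:

```
Id,Name,AnnualRevenue
0015g00000ABC,"Acme, Inc.",1500000
0025g00000DEF,Globex,2750000
```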
In most cases, if you're dealing with string values that might contain commas or other special characters, it's a good idea to enable `quoteAllText` to ensure that the CSV is parsed correctly by downstream applications. However, if you prefer a cleaner CSV without unnecessary quotes, you can disable it.
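If it helps, here is where the setting lives: `quoteAllText` is part of the Copy activity sink's `formatSettings` for a DelimitedText sink. A minimal fragment of the sink definition (the store settings here assume ADLS Gen2):

```json
"sink": {
    "type": "DelimitedTextSink",
    "storeSettings": {
        "type": "AzureBlobFSWriteSettings"
    },
    "formatSettings": {
        "type": "DelimitedTextWriteSettings",
        "quoteAllText": true,
        "fileExtension": ".csv"
    }
}
```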