Partition data while writing to delta sink

In Azure mapping data flows we now have the option to save files in Delta format. But that is only available when we select an inline dataset (without a Databricks subscription). And when the sink dataset is an inline dataset, it does not allow setting the partitioning based on a column.

I can write PySpark code to rewrite the Delta table with the required partitioning, but that would incur additional cost.
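For reference, the rewrite I have in mind would look roughly like the sketch below. It is only a minimal illustration, assuming a Spark session that already has Delta Lake configured; the function name, the paths, and the partition column are hypothetical placeholders, not anything taken from the data flow itself.

```python
def rewrite_delta_partitioned(spark, source_path, target_path, partition_col):
    """Read a Delta table and write it back out partitioned by one column.

    `spark` is assumed to be a SparkSession with Delta Lake support enabled.
    All argument values are placeholders chosen by the caller.
    """
    # Load the unpartitioned Delta table written by the data flow sink.
    df = spark.read.format("delta").load(source_path)

    # Write a new copy of the table, partitioned by the chosen column.
    (
        df.write.format("delta")
        .mode("overwrite")
        .partitionBy(partition_col)
        .save(target_path)
    )


# Example call (hypothetical paths and column name):
# rewrite_delta_partitioned(spark, "/mnt/raw/sales", "/mnt/curated/sales", "country")
```

This reads and rewrites the whole table on every run, which is where the extra cost comes from.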

What workarounds are there for getting good performance on Delta data?

There is 1 answer

Answered by Satya V

There was a UI issue that the engineering team recently fixed. Until the fix is reflected at your end, you could use one of the following workarounds:

Option 1: Change the sink type to something else, such as a delimited text sink. You should then see the key columns under Key partitioning. After that, switch the sink type back to Delta.

Reference: https://learn.microsoft.com/en-us/answers/questions/599075/index.html

Option 2: Enable partitioning at the source instead.

The partitioned data then flowed through the stream, and I was able to get partitioned output as a result.