I am trying to implement Azure Data factory DATAFLOW to create CRM entity records into multiple entities in single transaction. If any error occurred in the second entity, then first entity record should be rollback. Please share your idea.
I tried with Json file as input with multiple hierarchy, representing multiple CRM entity. I used Data flow source json dataset and 3 CRM sinks. But, i am unable to achieve single transaction when an error occurred.
ADF does not support roll back option. You can have any Watermark column or flag in the target table which indicates the records which got inserted during the current pipeline run and delete only those records if any error occurred.
Watermark column is the column which can have the timestamp at which the row got inserted or it can be incrementing key. Before running the pipeline, maximum value of the watermark column is noted. Whenever the pipeline is failed, rows inserted after the maximum watermark value can be deleted.
Instead of deleting all records from current pipeline run, if records which are not copied in some entities only need to be deleted then based on the key field, we can delete the rows. Below is the approach.
img1: entity1 data
img2: entity2 data
source1@id = source2@id
.img3: exists transformation settings
delete if
true()
.img4: Alter Row transformation settings
id
.Img5: Sink settings
img6: Sink data preview.
When pipeline with this dataflow is run, all rows which are in entity1 but not in entity2 are deleted.