Data lineage and provenance of an Airflow pipeline


Imagine that I have 3 tasks in an Airflow DAG:

extract: requests a list of JSON objects from an API
transform: transforms some objects and converts others to a specific JSON schema
load: sends the transformed objects to another API

To track these objects, I want to store each of them in a database. For each object I want to record how it was created, which objects were used to create it, the name of the task that produced it, etc. I also want to visualise the pipeline at the object level, meaning a graphical interface in which I can see the full history of an object.
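To make this more concrete, here is a rough sketch of the kind of record I have in mind. It is only an assumption about what the storage could look like, not a chosen design: the table names (objects, lineage), the helper names (open_store, record_object, history) and the use of SQLite are all made up for illustration. Each task would call record_object with the IDs of the objects it consumed, and history would walk back through the parents of an object.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Hypothetical schema: one row per object, one row per parent -> child link.
SCHEMA = """
CREATE TABLE IF NOT EXISTS objects (
    object_id  TEXT PRIMARY KEY,
    dag_id     TEXT NOT NULL,
    task_id    TEXT NOT NULL,
    run_id     TEXT NOT NULL,
    created_at TEXT NOT NULL,
    payload    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS lineage (
    child_id  TEXT NOT NULL REFERENCES objects(object_id),
    parent_id TEXT NOT NULL REFERENCES objects(object_id),
    PRIMARY KEY (child_id, parent_id)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the lineage store; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_object(conn, payload, task_id, parents=(),
                  dag_id="example_dag", run_id="manual"):
    """Store one object plus links to the objects it was derived from."""
    object_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO objects VALUES (?, ?, ?, ?, ?, ?)",
        (object_id, dag_id, task_id, run_id,
         datetime.now(timezone.utc).isoformat(), json.dumps(payload)),
    )
    conn.executemany(
        "INSERT INTO lineage VALUES (?, ?)",
        [(object_id, parent_id) for parent_id in parents],
    )
    conn.commit()
    return object_id


def history(conn, object_id):
    """Return (object_id, task_id) for the object and its ancestors, nearest first."""
    return conn.execute(
        """
        WITH RECURSIVE ancestors(object_id, depth) AS (
            SELECT ?, 0
            UNION
            SELECT l.parent_id, a.depth + 1
            FROM lineage AS l
            JOIN ancestors AS a ON l.child_id = a.object_id
        )
        SELECT o.object_id, o.task_id
        FROM ancestors AS a
        JOIN objects AS o ON o.object_id = a.object_id
        ORDER BY a.depth
        """,
        (object_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = open_store()
    raw = record_object(conn, {"id": 1}, task_id="extract")
    shaped = record_object(conn, {"id": 1, "schema": "v1"},
                           task_id="transform", parents=[raw])
    sent = record_object(conn, {"status": "sent"},
                         task_id="load", parents=[shaped])
    # Walk back from the loaded object to the raw API object.
    for obj_id, task in history(conn, sent):
        print(task, obj_id)
```

Inside a real Airflow task, the dag_id, task_id and run_id would presumably come from the task context rather than being passed by hand. What I am missing is whether there is an established pattern or tool for this kind of object-level history, instead of rolling my own tables.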

Do you have any suggestions for design patterns to do this? Are there tools for it? Which database and visualisation tool would you suggest?

Thanks!

I have searched Google for tools, but didn't find anything specific to my needs. I would appreciate suggestions from data specialists.

