I am using Microsoft Fabric for an ETL project in my internship. The project involves transferring data from a SharePoint folder to a data warehouse using Dataflow Gen2, and subsequently loading the data from this warehouse into Power BI Desktop. I have two datasets with the same data structure, but one is for sales and the other is for purchases. The two datasets live in different subfolders on SharePoint. I would like to know if, in the future, when new data is added to the respective folders, the dataflow can recognize which dataset to append it to based on the folder it came from, despite both having the same structure. Thank you
I've tried doing some transformations in Power Query inside the dataflow.
Yes, the dataflow can recognize which dataset new data belongs to, but you need to parameterize it so it picks up the right folder and the new records.
You can parameterize the folder path at any level to cover the scope you need, whether a higher-level folder or a dedicated subfolder; see the sketch below.
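As a minimal Power Query M sketch of what that parameterization could look like in Dataflow Gen2 (the site URL and the `DatasetFolder` parameter name are placeholders, not values from your setup):

```
// DatasetFolder is a text parameter, e.g. "Sales" or "Purchases" (hypothetical names)
let
    // Connect to the SharePoint site; the URL below is a placeholder
    Source = SharePoint.Files("https://contoso.sharepoint.com/sites/YourSite", [ApiVersion = 15]),
    // Keep only files under the subfolder named by the parameter,
    // so the same dataflow logic serves both the sales and the purchases dataset
    FilteredByFolder = Table.SelectRows(Source, each Text.Contains([Folder Path], DatasetFolder))
in
    FilteredByFolder
```

With this pattern the folder name is the only thing that changes between the two datasets, so the rest of your transformations can stay identical.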
A timestamp is one way to catch the updates: track the last-updated timestamp and use it as the reference point for new data. You can also wildcard the file path or use expressions to select files based on naming conventions or patterns, as in the sketch below.
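For example, both filters might look like this in Power Query M (the `LastLoadTimestamp` parameter and the `sales_` prefix are assumptions for illustration, not names from your setup):

```
let
    Source = SharePoint.Files("https://contoso.sharepoint.com/sites/YourSite", [ApiVersion = 15]),
    // Incremental filter: keep only files modified since the last successful load.
    // LastLoadTimestamp would be a datetime parameter you maintain between runs (hypothetical name).
    NewFiles = Table.SelectRows(Source, each [Date modified] > LastLoadTimestamp),
    // Pattern filter: keep only files that follow a naming convention,
    // e.g. sales_2024-01.xlsx (the prefix is hypothetical)
    MatchingFiles = Table.SelectRows(NewFiles, each Text.StartsWith([Name], "sales_"))
in
    MatchingFiles
```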
You can also use event-based triggers to start your pipeline when new files are added to the folder.
I would recommend planning your folder structure and data ingestion strategy up front, so you can build seamless pipelines.
Hope that helps!