I am working on data warehousing project, Need help with below
OLAP Table:
Product Dimension Table: Product_id, category_id, category_name,brand_id, brand_name ,manufacturer_id, manufacturer_name
OLTP Tables: Each table contains create_ts and update_ts for tracking creation & update in tables.
**Product_info, id, product_name,category_id,brand_id,manufacturer,create_ts, update_ts
Product_category_mapping: id,product_id,category_id,create_ts, update_ts
brand: id, name,create_ts, update_ts
manufacturer:id, name,create_ts, update_ts**
Looking to track all the changes in any of the tables, should reflect in the dimension table.
For Example:
Current OLAP Snapshot
Product_id, category_id, category_name,brand_id, brand_name ,manufacturer_id, manufacturer_name 1,33,Noodles,45, Nestle,455,nestele_pvt_ltd
Suppose brand name changes from nestle to nestle-us, How will we track this as we are capturing changes based on only product_info update_ts??
Should we consider all the 4 table changes??
Please suggest.
if data changes in any table that is a source for your DW then you need to include it in your extract logic.
For reference data like this where you can have a number of tables that contribute to a single "target" table, an approach I often take is to create a View across these tables in your source DB, include all the columns you need to take across to the DW but only have a single update_ts column that is calculated using the SQL GREATEST function where you pass in the update_ts columns from all the tables in the View. Then you only need to compare this single column to your "last extracted date" to determine if there are any changes that you may need to process