I retrieve data from an SCD-2 table with many parameters, and I need to build my own SCD-2 table that tracks only one of them. Therefore, I need to merge the redundant intervals in which that parameter does not change. Please recommend an algorithm that does this well.
What I receive from the source table:
I need to transform it to:
You can use the following steps to get the required result. Of course, you can do it all in one statement with sub-selects or CTEs, but for better traceability I prefer temporary tables.
Step 1: Identify the start and end of each single-value period.
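Note that in LAG/LEAD it is essential to supply a replacement value for NULL (-99 in the example) that cannot match any of the possible values in the column. Otherwise the comparison against the missing neighbour of the first or last row evaluates to NULL, and that row is not flagged as a start or end.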
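To make step 1 concrete, here is a minimal Python sketch that emulates the LAG/LEAD comparison on a small in-memory list. The sample rows, the column names `row_actual_from` and `value`, and the helper `mark_periods` are assumptions made up for illustration; only `row_actual_to`, `period_start`, `period_end` and the -99 replacement come from the answer. It also assumes a single business key and rows already sorted by start date.

```python
from datetime import date

# Hypothetical source rows for one key, already sorted by row_actual_from.
rows = [
    {"row_actual_from": date(2020, 1, 1), "row_actual_to": date(2020, 1, 31), "value": 10},
    {"row_actual_from": date(2020, 2, 1), "row_actual_to": date(2020, 2, 29), "value": 10},
    {"row_actual_from": date(2020, 3, 1), "row_actual_to": date(2020, 3, 31), "value": 10},
    {"row_actual_from": date(2020, 4, 1), "row_actual_to": date(2020, 4, 30), "value": 20},
    {"row_actual_from": date(2020, 5, 1), "row_actual_to": date(9999, 12, 31), "value": 10},
]

SENTINEL = -99  # stands in for "no neighbouring row"; must never be a real value


def mark_periods(rows, sentinel=SENTINEL):
    """Flag the first and last row of each run of equal consecutive values."""
    marked = []
    for i, row in enumerate(rows):
        # Counterparts of LAG(value, 1, -99) and LEAD(value, 1, -99).
        prev_value = rows[i - 1]["value"] if i > 0 else sentinel
        next_value = rows[i + 1]["value"] if i + 1 < len(rows) else sentinel
        marked.append({
            **row,
            "period_start": 1 if row["value"] != prev_value else 0,
            "period_end": 1 if row["value"] != next_value else 0,
        })
    return marked


marked = mark_periods(rows)
flags = [(r["period_start"], r["period_end"]) for r in marked]
```

If the first row's value happened to equal the sentinel, it would compare equal to its "missing" predecessor and would not be flagged as a start, which is why the replacement value has to lie outside the column's domain.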
Step 2: Filter on the start/end rows and assign the row_actual_to of each end row to its start row.
If the period of a value consists of only one row, that row has both period_start and period_end set to 1, so their sum is 2. In this case row_actual_to already contains the wanted value.
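Step 2 can be sketched in Python as follows. This is only an illustration of the idea, not the original SQL: the input list is a hypothetical step 1 result, and `row_actual_from`, `value` and the helper `merge_period_ends` are assumed names. After filtering, the row that follows a start row with `period_end = 0` is always the end row of the same period, so its `row_actual_to` can be copied over.

```python
from datetime import date

# Hypothetical result of step 1 (flags already computed), sorted by row_actual_from.
marked = [
    {"row_actual_from": date(2020, 1, 1), "row_actual_to": date(2020, 1, 31), "value": 10, "period_start": 1, "period_end": 0},
    {"row_actual_from": date(2020, 2, 1), "row_actual_to": date(2020, 2, 29), "value": 10, "period_start": 0, "period_end": 0},
    {"row_actual_from": date(2020, 3, 1), "row_actual_to": date(2020, 3, 31), "value": 10, "period_start": 0, "period_end": 1},
    {"row_actual_from": date(2020, 4, 1), "row_actual_to": date(2020, 4, 30), "value": 20, "period_start": 1, "period_end": 1},
    {"row_actual_from": date(2020, 5, 1), "row_actual_to": date(9999, 12, 31), "value": 10, "period_start": 1, "period_end": 1},
]


def merge_period_ends(marked):
    """Keep boundary rows and copy each period's end date onto its start row."""
    # Rows in the middle of a period have both flags set to 0 and are dropped.
    boundary = [r for r in marked if r["period_start"] + r["period_end"] >= 1]
    adjusted = []
    for i, row in enumerate(boundary):
        row = dict(row)  # copy, so the input list is left untouched
        if row["period_start"] == 1 and row["period_end"] == 0:
            # Multi-row period: the next boundary row is its end row.
            row["row_actual_to"] = boundary[i + 1]["row_actual_to"]
        # Single-row period (flag sum of 2): row_actual_to is already correct.
        adjusted.append(row)
    return adjusted


adjusted = merge_period_ends(marked)
```

The end rows are still present at this point; they are only removed in the next step.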
Step 3: Keep only the (adjusted) start row of each value period.
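As a small illustration of this last filter, the Python sketch below starts from a hypothetical step 2 result and keeps only the start rows, dropping the helper flags. The sample data and the column names `row_actual_from` and `value` are assumptions, not taken from the original tables.

```python
from datetime import date

# Hypothetical result of step 2: start rows already carry the period's end date.
adjusted = [
    {"row_actual_from": date(2020, 1, 1), "row_actual_to": date(2020, 3, 31), "value": 10, "period_start": 1, "period_end": 0},
    {"row_actual_from": date(2020, 3, 1), "row_actual_to": date(2020, 3, 31), "value": 10, "period_start": 0, "period_end": 1},
    {"row_actual_from": date(2020, 4, 1), "row_actual_to": date(2020, 4, 30), "value": 20, "period_start": 1, "period_end": 1},
    {"row_actual_from": date(2020, 5, 1), "row_actual_to": date(9999, 12, 31), "value": 10, "period_start": 1, "period_end": 1},
]

# Keep one row per value period and project away the helper columns.
final = [
    {key: row[key] for key in ("row_actual_from", "row_actual_to", "value")}
    for row in adjusted
    if row["period_start"] == 1
]

for row in final:
    print(row["row_actual_from"], row["row_actual_to"], row["value"])
```

Note that the value 10 appears twice in the result: the two periods are separated by a period with value 20, so they must stay separate intervals.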