After some time I figured imho quite fast one-pass merge statement to historize slowly changing dimension type 2.
It works perfectly on tables without unique constraint.
Mostly it works also on tables with unique constraint. But sometimes (usually when the change in history is bigger) I get error ORA-00001, unique constraint violation. Of course, I know there are two-pass methods which work, but they are slower.
My only guess is that Oracle does sometimes the INSERT before UPDATE which duplicates the TH_Valid_To_Date for a moment.
Any ideas how to avoid it (while keeping primary key)?
Source table:
CREATE TABLE TESTS
(
T_Key_1 NUMBER(38,0) DEFAULT -1 NOT NULL ENABLE /* */
,T_Key_2 NUMBER(38,0) DEFAULT -1 NOT NULL ENABLE /* */
,Text_Value VARCHAR2(100) /* */
,Number_Value NUMBER(38,0) DEFAULT -1 NOT NULL ENABLE /* */
,Amount NUMBER /* */
,CONSTRAINT T_PK PRIMARY KEY (T_Key_1, T_Key_2) /* Primární klíč */
)
;
History table:
CREATE TABLE TEST_HISTORY
(
T_Key_1 NUMBER(38,0) DEFAULT -1 NOT NULL
,T_Key_2 NUMBER(38,0) DEFAULT -1 NOT NULL
,Text_Value VARCHAR2(100)
,Number_Value NUMBER(38,0) DEFAULT -1 NOT NULL
,Amount NUMBER
,TH_Valid_From_Date DATE DEFAULT to_date('1000-01-01','yyyy-mm-dd') NOT NULL /* SCD2 - Start of validity of record. */
,TH_Valid_To_Date DATE DEFAULT to_date('3000-01-01','yyyy-mm-dd') NOT NULL /* SCD2 - End of validity of record. */
,CONSTRAINT TH_PK PRIMARY KEY (T_Key_1, T_Key_2, TH_Valid_To_Date) using index local
)
/** Physical Options **************************************************************************************************/
partition by range (TH_Valid_To_Date) interval (NUMTOYMINTERVAL (1, 'MONTH'))
(partition P_10000000 values less than (TO_DATE ('01-01-1000', 'DD-MM-YYYY')))
ENABLE ROW MOVEMENT
;
Merge:
MERGE INTO (SELECT * FROM TEST_HISTORY WHERE TH_Valid_To_Date = to_date('3000-01-01','yyyy-mm-dd')) Hst /*change only current records which are identified by TH_Valid_To_Date = to_date('3000-01-01','yyyy-mm-dd') */
USING (
SELECT * FROM (
SELECT NVL(Src.T_Key_1, Dst.T_Key_1) AS T_Key_1
,NVL(Src.T_Key_2, Dst.T_Key_2) AS T_Key_2
,Src.Text_Value
,Src.Number_Value
,Src.Amount
,CASE WHEN Src.T_Key_1 is null THEN 'D' /*delete*/
WHEN Dst.T_Key_1 is null THEN 'I' /*insert*/
WHEN (Src.Text_Value=Dst.Text_Value OR (Src.Text_Value is null AND Dst.Text_Value is null))
AND (Src.Number_Value=Dst.Number_Value OR (Src.Number_Value is null AND Dst.Number_Value is null))
AND (Src.Amount=Dst.Amount OR (Src.Amount is null AND Dst.Amount is null))
THEN 'X' /*no change*/
ELSE 'U' /*update*/ END AS Operation
FROM TESTS Src
FULL JOIN (SELECT * FROM TEST_HISTORY WHERE TH_Valid_To_Date = to_date('3000-01-01','yyyy-mm-dd')) Dst
ON (Src.T_Key_1 = Dst.T_Key_1 AND Src.T_Key_2 = Dst.T_Key_2)
)
INNER JOIN (SELECT LEVEL AS duplication FROM DUAL CONNECT BY LEVEL BETWEEN 1 AND 2) ON (duplication=1 OR Operation='U') /*need to duplicate update records so that they can go to both matched and not matched parts*/
WHERE Operation<>'X'
) Act
ON (Act.T_Key_1 = Hst.T_Key_1 AND Act.T_Key_2 = Hst.T_Key_2 AND Act.duplication=1 AND Act.operation<>'I')
WHEN MATCHED THEN UPDATE
SET
TH_Valid_To_Date = p_Load_Date - 1,
WHERE Hst.TH_Valid_To_Date = to_date('3000-01-01','yyyy-mm-dd')
WHEN NOT MATCHED THEN INSERT /*+ append */
(
T_Key_1
,T_Key_2
,Text_Value
,Number_Value
,Amount
,TH_Valid_From_Date /*Auditní sloupec*/
,TH_Valid_To_Date /*Auditní sloupec*/
) VALUES (
Act.T_Key_1
,Act.T_Key_2
,Act.Text_Value
,Act.Number_Value
,Act.Amount
,p_Load_Date /*Auditní sloupec*/
,to_date('3000-01-01','yyyy-mm-dd') /*Auditní sloupec*/
)
;
I ran into this same issue myself: you cannot both insert and update a record. Oracle determines whether the record is to be inserted or updated when it first performs the join. Here is a simplified test case that illustrates this: