I have written a query to take compare of columns/data of 2 database table in oracle which will print records where there is any difference in rows 2 tables based on key columns (fields in partition by clause) and if there is no difference then for any particular field then it will print null and for fields where there is difference then it will print there values (where row_cnt = 2) and if there is some extra record in any of table then it also print it (where row_cnt =1).
Sample script --
create table src_tab (id number,name varchar2(10), job varchar2(10), sal number);
create table tgt_tab (id number,name varchar2(10), job varchar2(10), sal number);
insert into src_tab values (1, 'john', 'not sure', 100);
insert into src_tab values (2, 'kevin', 'unemp', 200);
insert into src_tab values (3, 'chad', 'emplyd', 400);
insert into src_tab values (4, 'page', 'NA', 500);
insert into src_tab values (5, 'emmy', 'desk', 600);
insert into tgt_tab values (1, 'john', 'not sure', 150);
insert into tgt_tab values (2, 'kevin', 'unemp', 200);
insert into tgt_tab values (3, 'chad', 'empl', 400);
insert into tgt_tab values (4, 'page', 'NA', 500);
Comparison query --
Select * from (select id,name,
case NEW_CNT when 1 then 'TGT_TAB' else 'SRC_TAB' end tbl,
ROW_CNT,
case when count(job) over(partition by id,name)
between 1 and count(distinct job) over(partition by id,name)
then job end job,
case when count(sal) over(partition by id,name)
between 1 and count(distinct sal) over(partition by id,name)
then sal end sal
FROM (
select
id,name, job, sal,
sum(NEW_CNT) NEW_CNT, count(*) over(partition by id,name) ROW_CNT
FROM (
select
id,name, job, sal,
-1 NEW_CNT
from src_tab O
union all
select
id,name, job, sal,
1 NEW_CNT
from tgt_tab N
)
group by
id,name, job, sal
having sum(NEW_CNT) != 0
)
order by 1, 2, new_cnt)
;
Since this query is written for oracle 19c database now I require to run this query on Sybase database but when I tried it on Sybase then I found out that COUNT(DISTINCT) windowing function is not allowed there so now looking for alternative ways to print the same output as it is on Sybase I tried using lead/lag function since it is supported on Sybase but the output is not good.
Alternative Tried --
case when sal <>
lead(sal,1,sal) over (partition by id,name order by id) then sal end sal_lead
Expected Result --
ID NAME TBL ROW_CNT JOB SAL
1 john SRC_TAB 2 null 100
1 john TGT_TAB 2 null 150
3 chad SRC_TAB 2 emplyd null
3 chad TGT_TAB 2 empl null
5 emmy SRC_TAB 1 desk 600
We are using both Sybase DB ASE and IQ if the query can be compatible with Sybase-IQ then also it will good. Please let me know if more information is required from my side to understand the query I have written in orcale.
Edit 1: One Alternative I found which is working on Sybase-IQ is as below if someone has more good approach please do comment !!
with t1 as (
select 'SRC_TAB' tbl,
s.*
from src_tab s
union all
select 'TGT_TAB' tbl,
t.*
from tgt_tab t
),
t2 as (
select t1.*,
case max(job) over(partition by id,name)
when min(job) over(partition by id,name) then 1
else 0
end same_job_flag,
case max(sal) over(partition by id,name)
when min(sal) over(partition by id,name) then 1
else 0
end same_sal_flag,
count(*) over(partition by id,name) row_cnt
from t1
)
select id,
name,
tbl,
row_cnt,
case
when row_cnt = 1 then job
when same_job_flag = 0 then job
end job,
case
when row_cnt = 1 then sal
when same_sal_flag = 0 then sal
end sal
from t2
where same_job_flag = 0
or same_sal_flag = 0
or row_cnt = 1
order by id,
name,
tbl;
Edit 2 : Oracle Procedure to Generate Comparison query (query -2, Alternative which I have written). I have written same procedure for my 1st Oracle query as well with COUNT(DISTINCT) one :)
CREATE OR REPLACE PROCEDURE pr_compare_table2(
p_key_coln VARCHAR2,
p_comp_coln VARCHAR2,
p_src_tab VARCHAR2,
p_tgt_tab VARCHAR2,
p_src_whr VARCHAR2,
p_tgt_whr VARCHAR2
)
AS
l_sql CLOB;
l_sql1 CLOB;
l_sql2 CLOB;
l_sql3 CLOB;
l_length NUMBER;
BEGIN
FOR i IN (SELECT REGEXP_SUBSTR(p_comp_coln,'[^,]+',1,LEVEL) value
FROM dual
CONNECT BY LEVEL <= LENGTH(p_comp_coln) - LENGTH(REPLACE(p_comp_coln, ',' )) +1)
LOOP
l_sql := TO_CLOB(l_sql)||'CASE MAX('||i.value||') OVER(PARTITION BY '||p_key_coln|| ')
WHEN MIN('||i.value||') OVER(PARTITION BY '||p_key_coln||' ) THEN 1 ELSE 0 END
AS same_'||i.value||'_flag,';
END LOOP;
l_sql1 := 'WITH t1 AS (
SELECT '''||p_src_tab||''' tbl,
s.*
FROM '|| p_src_tab ||' s '||p_src_whr||'
UNION ALL
SELECT '''||p_tgt_tab||''' tbl,
t.*
FROM '|| p_tgt_tab ||' t '||p_tgt_whr||'
),
t2 AS (
SELECT t1.*,'||TO_CLOB(l_sql)||
'COUNT(*) OVER(PARTITION BY '||p_key_coln ||') row_cnt
FROM t1
) ';
FOR i IN (SELECT REGEXP_SUBSTR(p_comp_coln,'[^,]+',1,LEVEL) value
FROM dual
CONNECT BY LEVEL <= LENGTH(p_comp_coln) - LENGTH(REPLACE(p_comp_coln, ',' )) +1)
LOOP
l_sql2 := TO_CLOB(l_sql2)||'
CASE
WHEN row_cnt = 1 THEN '||i.value||
' WHEN same_'||i.value||'_flag = 0 THEN '||i.value||' END '|| i.value||',';
END LOOP;
l_sql1 := TO_CLOB(l_sql1)||'
SELECT '||p_key_coln||', tbl,row_cnt,
'||TO_CLOB(SUBSTR(l_sql2,1, LENGTH(l_sql2)-1 ))|| '
FROM t2 WHERE 1=1 AND';
FOR i IN (SELECT REGEXP_SUBSTR(p_comp_coln,'[^,]+',1,LEVEL) value
FROM dual
CONNECT BY LEVEL <= LENGTH(p_comp_coln) - LENGTH(REPLACE(p_comp_coln, ',' )) +1)
LOOP
l_sql3 := TO_CLOB(l_sql3)||' same_'||i.value||'_flag = 0 or ';
END LOOP;
l_sql1 := TO_CLOB(l_sql1)||' '||TO_CLOB(l_sql3)||'
row_cnt = 1
ORDER BY '||p_key_coln||';';
l_length := LENGTH(l_sql1);
IF l_length < 32767 THEN
dbms_output.put_line(l_sql1);
ELSE
dbms_output.put_line(LENGTH(l_sql1));
END IF;
END pr_compare_table2;
Background:
If OP's IQ query (with CTEs and windowing functions) generates the expected result then one option OP may want to consider:
I was able to hobble together a couple queries that run in ASE that generate the expected result ...
Adding a few more rows to the data set:
One 4-way union query:
We can cut this down to a 3-way union by using a 2-row dummy/derived table to split a common key match into 2 separate rows:
NOTES:
ROW_CNTvalues are just hardcoded values (the only reasonROW_CNTis generated by these queries is to match OP's expected output)Both of these queries generate the following:
NOTES:
ASE 16.0 SP04 PL04instanceI'm guessing OP's question is a minimal, reproducible example of a bigger project that needs to run a
diffagainst a set tables.If this is the case then it won't take long to figure out that none of the queries presented so far (OP's queries, my queries) 'scale up' very well. All of these queries are hardcoded towards a specific pair of tables, a specific set of columns and a specific set of datatypes.
For a full blown
diffprocess OP will need to take into consideration:casestatements)= NULLandis NULLbehave differently)To build a robust tool will take a good bit of time and effort, especially if trying to build the entire tool in SQL. A few ideas to consider:
awk,perl,ruby,c(++),java)rs_subcmp(OS-level, java-based utility) andData Assurance(OS-level, java-based, licen$$$ed utility))