I am able to get join elimination to work for simple cases such as one-to-one relations, but not for slightly more complicated scenarios. Ultimately I want to try anchor modelling, but first I need to find a way around this problem. I'm using Oracle 12c Enterprise Edition Release 12.1.0.2.0.
DDL for my test case:
drop view product_5nf;
drop table product_color cascade constraints;
drop table product_price cascade constraints;
drop table product cascade constraints;
create table product(
product_id number not null
,constraint product_pk primary key(product_id)
);
create table product_color(
product_id number not null references product
,color varchar2(10) not null
,constraint product_color_pk primary key(product_id)
);
create table product_price(
product_id number not null references product
,from_date date not null
,price number not null
,constraint product_price_pk primary key(product_id, from_date)
);
Some example data:
insert into product values(1);
insert into product values(2);
insert into product values(3);
insert into product values(4);
insert into product_color values(1, 'Red');
insert into product_color values(2, 'Green');
insert into product_price values(1, date '2016-01-01', 10);
insert into product_price values(1, date '2016-02-01', 8);
insert into product_price values(1, date '2016-05-01', 5);
insert into product_price values(2, date '2016-02-01', 5);
insert into product_price values(4, date '2016-01-01', 10);
commit;
The 5NF view
This first view does not compile - it fails with ORA-01799: a column may not be outer-joined to a subquery. Unfortunately, this is how most of the historized views are defined when I'm looking at the online examples of anchor modelling...
create view product_5nf as
select p.product_id
,pc.color
,pp.price
from product p
left join product_color pc on(
pc.product_id = p.product_id
)
left join product_price pp on(
pp.product_id = p.product_id
and pp.from_date = (select max(pp2.from_date)
from product_price pp2
where pp2.product_id = pp.product_id)
);
Below is my attempt at fixing it. When using this view with a simple select of product_id
, Oracle manages to eliminate product_color but not product_price.
create view product_5nf as
select product_id
,pc.color
,pp.price
from product p
left join product_color pc using(product_id)
left join (select pp1.product_id, pp1.price
from product_price pp1
where pp1.from_date = (select max(pp2.from_date)
from product_price pp2
where pp2.product_id = pp1.product_id)
)pp using(product_id);
select product_id
from product_5nf;
----------------------------------------------------------
| Id | Operation | Name | Rows |
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 |
|* 1 | HASH JOIN OUTER | | 4 |
| 2 | INDEX FAST FULL SCAN| PRODUCT_PK | 4 |
| 3 | VIEW | | 3 |
| 4 | NESTED LOOPS | | 3 |
| 5 | VIEW | VW_SQ_1 | 5 |
| 6 | HASH GROUP BY | | 5 |
| 7 | INDEX FULL SCAN | PRODUCT_PRICE_PK | 5 |
|* 8 | INDEX UNIQUE SCAN | PRODUCT_PRICE_PK | 1 |
----------------------------------------------------------
The only solution I have found is to use scalar sub queries instead, like this:
create or replace view product_5nf as
select p.product_id
,pc.color
,(select pp.price
from product_price pp
where pp.product_id = p.product_id
and pp.from_date = (select max(from_date)
from product_price pp2
where pp2.product_id = pp.product_id)) as price
from product p
left join product_color pc on(
pc.product_id = p.product_id
)
select product_id
from product_5nf;
---------------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------------
| 0 | SELECT STATEMENT | | 4 |
| 1 | INDEX FAST FULL SCAN| PRODUCT_PK | 4 |
---------------------------------------------------
Now Oracle sucessfully eliminates the product_price table. However, scalar sub queries are implemented differently than joins and they way they are executed simply doesn't allow me to get any acceptable performance in a real world scenario.
TL;DR
How can I rewrite the view product_5nf
so that Oracle sucessfully eliminates both of the dependent tables?
I think you have two problems going on here.
First, join elimination only works for certain, specific situations (PK-PK, PK-FK, etc). It is not a general thing where you can
LEFT JOIN
to any row set that will return a single row for each join key value and have Oracle eliminate the join.Second, even if Oracle were advanced enough to do join elimination on ANY
LEFT JOIN
where it knew it would get only one row per join key value, Oracle does not yet support join eliminations onLEFT JOINS
that are based on a composite key (Oracle support document 887553.1 says this is coming in R12.2).One workaround you could consider is materializing a view with the last row for each
product_id
. ThenLEFT JOIN
to the materialized view. Like this: