If we got rid of INNER JOIN, LEFT JOIN. RIGHT JOIN, and FULL JOIN tomorrow, could we completely replace their functionality with CROSS JOIN and filters on it? I initially thought that we obviously could, but edge cases such as this one have given me doubts.
Is CROSS JOIN all we need?
147 views Asked by J. Mini AtThere are 4 answers
On
No it is not logically "all we need".
An INNER JOIN is logically just a CROSS JOIN with the join predicate evaluated on all rows.
- For left outer joins for any rows in the left table that did not match anything output these with NULL values for the right table columns.
- For right outer joins for any rows in the right table that did not match anything output these with NULL values for the left table columns.
- Full outer joins combine both of these behaviours.
Most of the time if you were presented with a CROSS JOIN result between table A and B and it contained a row identifier for both table A and B then it would contain enough information to calculate what the outer joined result should be for a particular join predicate (though the SQL to do this would be inefficient)
For example for
DECLARE @A TABLE
(
A_ID INT PRIMARY KEY,
A_N INT
)
DECLARE @B TABLE
(
B_ID INT PRIMARY KEY,
B_N INT
)
INSERT @A
VALUES (1, 101),
(2, 101),
(3, 102),
(4, 102);
INSERT @B
VALUES (1, 101),
(2, 101);
Then one way of simulating
SELECT *
FROM @A A
LEFT JOIN @B B ON A.A_N = B.B_N;
would be (DB Fiddle)
WITH A_cross_join_B AS
(
SELECT *
FROM @A A
CROSS JOIN @B B
), A_cross_join_B_with_loj_flag AS
(
SELECT *,
/*2 if Join predicate matches, 1 if we are preserving a single instance of this row for outer join purposes, 0 otherwise*/
Flag = CASE WHEN A_N = B_N THEN 2 ELSE CASE WHEN 1 = ROW_NUMBER() OVER (PARTITION BY A_ID ORDER BY CASE WHEN A_N = B_N THEN 0 ELSE 1 END) THEN 1 ELSE 0 END END
FROM A_cross_join_B
)
SELECT A_ID,
A_N,
B_ID = CASE WHEN Flag = 2 THEN B_ID END,
B_N = CASE WHEN Flag = 2 THEN B_N END
FROM A_cross_join_B_with_loj_flag
WHERE Flag <> 0
The above does rely on all rows from table A and all rows from table B being present in the cross join result.
In general each row from table A will be represented b_card times - where b_card is the number of rows in table B. And similarly each row from table B will be represented a_card times.
As long as both a_card and b_card are >=1 then all source rows are present.
The case where this falls down however is where one of these is 0 - i.e. the join is onto an empty table - then the cross join result is empty and it would be impossible.
JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
CROSS JOIN
How are you going get LEFT JOIN from CROSS JOIN?