I have an array in SQL containing all orders and all products from those orders and I need to find all pairs of products that have never been ordered together in one order.
I started with something like this
SELECT a.orderid, a.productid, b.OrderID, b.ProductID from [Order Details] a cross join [Order Details] b
except
SELECT a.orderid, a.productid, b.OrderID, b.ProductID from [Order Details] a cross join [Order Details] b
where a.ProductID=b.ProductID
but I don't know how to eliminate rest of the possible pairs so that I have only the ones I need left.
EDIT: Modified the query a bit, went for a diffrent approach. I'm getting closer but still don't know how to get rid of repetitions like ab and ba
select p1.productid, p2.productid from products p1 join products p2 on p1.productid<>p2.productid
except
select a.productid, b.productid from [Order Details] a join [Order Details] b on a.ProductID<>b.ProductID
where a.OrderID=b.OrderID
Use a left join and filter on missed joins:
This works because missed joins have all nulls in the row, and while the join condition is applied during the join, the where clause is applied after the join, so specifying that a joined column (that can't in reality be null) is null leaves only missed joins.
Another small but important point is the use of less than rather than not equals in the join condition when joining the product table to itself, which prevents products joining to themselves, but importantly prevents two products joining to themselves twice - which ultimately means more efficiency (half as many joins) and not having to use
distinct
to remove duplicate combinations.