SQL: find every row with IDs x,y,z where the values exist in no other row

35 views Asked by At

What I have: I have two simple tables, with a many-to-many-relation, and thus a table that links them.

What I want to do Table below: I want to find all towns that George and Joe has been in, but no one else has been in (the real database has thousands of entries in both table 1 and 2)

TABLE_1               LINK_TABLE             TABLE_2
ID   NAME             ID1   ID2              ID  NAME
1    George           1     1                1   New York
2    Joe              1     2                2   Los Angeles
3    Patrick          1     3                3   London
                      2     1                4   Tokyo
                      2     4                5   Paris 
                      2     3                6   Beijing
                      3     1
                      3     5
                      3     6

If I do this:

SELECT DISTINCT ID2 FROM LINK_TABLE WHERE ID1 IN (1, 2)

I get everything George and Joe have been in, but also the ones that Patrick has been in.

If I do this:

SELECT DISTINCT ID2 
FROM LINK_TABLE 
WHERE ID2 IN (
        SELECT DISTINCT ID2 
        FROM LINK_TABLE 
        WHERE ID1 IN (1, 2)
    ) AND ID1 NOT IN (1,2)

I get the ones that Patrick has been in, and that George and Joe also have been in.

Or I can do this:

SELECT DISTINCT ID2 
FROM LINK_TABLE 
WHERE ID1 IN (1, 2) AND ID2 NOT IN (
    SELECT DISTINCT ID2 FROM LINK_TABLE WHERE ID2 IN (
       SELECT DISTINCT ID2 FROM LINK_TABLE WHERE ID1 IN (1, 2)
    ) AND ID1 NOT IN (1,2)) 

buuuuuuut, at that point, it starts looking rather weird, and I also fear it's slow.

Is there any more... readable, and hopefully faster way to solve this in an SQL Query? Here's to hoping I'm just being tired after a day's work...

1

There are 1 answers

3
Joel Coehoorn On

This might work, but the question isn't fully clear:

SELECT DISTINCT ID2
FROM LINK_TABLE lt1
WHERE ID1 IN (1,2)
   AND NOT EXISTS (
      SELECT 1 
      FROM LINK_TABLE lt2
      WHERE lt2.ID2 = lt1.ID2
          AND l2.ID1 NOT IN (1,2)
    )

Hmm... you could also try this and it will likely be even faster:

SELECT ID2
FROM LINK_TABLE lt1
GROUP BY ID2
HAVING  SUM(CASE WHEN ID1 NOT IN (1,2) THEN 1 ELSE 0 END) = 0
    AND SUM(CASE WHEN ID1     IN (1,2) THEN 1 ELSE 0 END) > 0