SQL performance optimization: AND vs IN

59 views Asked by At

Given this example table

+----+------+-----+--------+
| ID | Name | Age | Exempt |
+----+------+-----+--------+
|  1 | AA   |  50 | FALSE  |
|  2 | BB   |  40 | FALSE  |
|  3 | CC   |  30 | FALSE  |
|  4 | DD   |  65 | TRUE   |
|  5 | EE   |  50 | FALSE  |
|  6 | FF   |  20 | FALSE  |
+----+------+-----+--------+

And this query

SELECT * FROM USERS
WHERE AGE > 40 AND EXPEMPT = TRUE;

I've been told that I could optimize this query having a subquery, but also having an IN statement like this

SELECT * FROM USERS
WHERE AGE > 40 AND ID IN (SELECT ID 
                          FROM USERS WHERE EXEMPT = TRUE);

That could work but I can't understand why the second query is better than the first one in performance.

1

There are 1 answers

0
Gordon Linoff On

I can think of no circumstances where the second query would be better than the first. The second requires scanning users and then doing additional work with the id; the first just requires scanning users. If the second can take advantage of an index or data partition, the first should be able to use the same index or data partition.

If you wanted to speed the query, then an index on (exempt, age) might be useful.

Do note that performance considerations on a table with 6 rows are generally meaningless -- all reasonable queries should be pretty fast on so few rows.