Given this example table
+----+------+-----+--------+
| ID | Name | Age | Exempt |
+----+------+-----+--------+
| 1 | AA | 50 | FALSE |
| 2 | BB | 40 | FALSE |
| 3 | CC | 30 | FALSE |
| 4 | DD | 65 | TRUE |
| 5 | EE | 50 | FALSE |
| 6 | FF | 20 | FALSE |
+----+------+-----+--------+
And this query
SELECT * FROM USERS
WHERE AGE > 40 AND EXPEMPT = TRUE;
I've been told that I could optimize this query having a subquery, but also having an IN statement like this
SELECT * FROM USERS
WHERE AGE > 40 AND ID IN (SELECT ID
FROM USERS WHERE EXEMPT = TRUE);
That could work but I can't understand why the second query is better than the first one in performance.
I can think of no circumstances where the second query would be better than the first. The second requires scanning
users
and then doing additional work with theid
; the first just requires scanningusers
. If the second can take advantage of an index or data partition, the first should be able to use the same index or data partition.If you wanted to speed the query, then an index on
(exempt, age)
might be useful.Do note that performance considerations on a table with 6 rows are generally meaningless -- all reasonable queries should be pretty fast on so few rows.