This query finds the largest country by area in each continent. It shows the continent, name, and area.
SELECT
c1.continent,
c1.name,
c1.area
FROM
countries c1
WHERE
c1.area = (
SELECT
MAX(c2.area)
FROM
countries c2
WHERE
c1.continent = c2.continent
)
If I'm understanding correctly the subquery will match each entry's continent from c1 to all c2 entries with the same continent and find the max. And if so the subquery will be run for every entry in c1. Doesn't that seem unnecessary?
Can someone please explain the sequence of events that happens in this query and if there is a better way to do this. Thanks.
You need to understand that SQL queries are non-procedural: They describe the result set, not how to create it.
If an index is available on
countries(continent, area)
, then the "subquery" is simply a look up in an index -- a very efficient operation.If an index is not available, then some database engines are smart enough to implement the subquery as a type of join operation -- typically using a hash join if the data is large enough.
Admittedly, some databases would run the subquery for each row in the outer query, essentially doing a nested loop join. But there is no requirement that the engine do this.