I have looked at this question and its answer but am still uncertain. My simplified schema is:
CREATE TABLE department(
ident text not null primary key,
name text not null,
building text not null
);
CREATE TABLE staff(
ident integer primary key autoincrement,
name text not null,
foreign key (dept) references department(ident)
);
My query is trying to find all builds that have staff in them:
select name, building
from department
where exists (
select 1
from staff
where dept == department.ident
)
order by name;
- Is this a correlated subquery? I believe so because the inner query has to run once for each department.
- Would it be more efficient to do a join and then filter as shown below?
select distinct
department.name as name,
department.building as building
from department join staff
on department.ident = staff.dept
order by name;
In a good RDBMS like Microsoft SQL Server or Oracle Database those query will produce exactly the same execution plan.
You must remember that a query language is not a programming language. SQL only lets you define the shape of the result and not the way to retrieve the information...
One the query is parsed, it is convert into a formulae of relational algebrae then simplified by mathematical rules like substitution and finaly optimized by semantics and statistics rules to reorder the sequencing of operations and choose the appropriate algorithms.
In fact, the two queries you have written are mathematically strictly equivalent, and must produce the same execuption plan (eg the sequence of minimal operations that must be done to give the result).
As a test in MS SQL Server, this conduct to the final execution plans above:
I think you are on PostGreSQL... So try to get the execution plan with EXPLAIN...