PostgreSQL better query performance

Question

PostgreSQL better query performance

60 views Asked by Pablo Alejandro At 17 November 2023 at 15:57

I have customers and visits table, and I would like to know which of these two queries has the better performance: (I have indexes defined for those columns)

Query 1

SELECT
  customers.id as id,
  COALESCE(v.count, 0) as visits

FROM
  customers
  LEFT OUTER JOIN (
    SELECT customer_id, count(*)
    FROM visits
    GROUP BY customer_id
  ) as v on visits.customer_id = customers.id

EXPLAIN ANALYZE Result

Query 2

SELECT
  customers.id as id,
  (
      SELECT count(*)
        FROM visits
        WHERE
        visits.customer_id=customers.id
  ) as visits

FROM
  customers

EXPLAIN ANALYZE Result

As you see in the above examples the second query has a lower cost, but the excecution time is higher than the first query.

I'm quite confused about it. I assume that depends on the filter options. Can you help me to understand that, and if you have a better query, please let me know.

Original Q&A

There are 2 answers

**O. Jones** · Answer 1 · 2023-11-17T17:15:27+00:00

In principle your first one, with the aggregating subquery, is faster than the one with the correlated subqquery. That's because the aggregate result set can be evaluated just once then hash-joined to the first table.

But the query planner might be smart enough to handle them the same way.

An index on visits.customer_id will help.

**Erwin Brandstetter** · Answer 2 · 2023-11-18T01:58:48+00:00

I assume that depends on the filter options.

Exactly.

To count visits for all or most customers, the 1st query is substantially faster, as it processes the whole (mostly or all relevant) visits table in one fell swoop. Index not required, but an index-only scan may still help.

To count visits for a small selection of customers, the 2nd query with a correlated subquery is substantially faster, as it only processes the few rows from table visits that are actually relevant, which outweighs the overhead of running a separate aggregate for each customer. Index required.

(Your query plans don't seem to match the given setup.)

See:

Multiple array_agg() calls in a single query

TechQA.

PostgreSQL better query performance

Query 1

EXPLAIN ANALYZE Result

Query 2

EXPLAIN ANALYZE Result

There are 2 answers

Related Questions in POSTGRESQL

Related Questions in PERFORMANCE

Related Questions in QUERY-OPTIMIZATION

Related Questions in POSTGRESQL-PERFORMANCE

Popular Questions

Trending Questions