Optimize a query using multicolumn indices

Say you want to optimize a query in a postgres database like:

SELECT DISTINCT ON (first)
    first,
    second,
    third
FROM my_table
WHERE
    second > 100
    AND fourth = 3
ORDER BY
    first,
    second DESC,
    third DESC

(EDIT: In this example, let's say fourth=3 is about 25% of rows, and second > 100 is only around 5%)

Here you want to select one row per value of first, based on a couple of filter conditions, ordered by three columns. As far as I know, the best way to do this would be to create an index on first and second, then an index on first, second DESC, third DESC. Unfortunately, the second index doesn't seem to be used when I analyze the query.
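
Concretely, that would be roughly:

CREATE INDEX ON my_table (first, second);
CREATE INDEX ON my_table (first, second DESC, third DESC);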

Is this the ideal way to create these indexes, or could there be a single index that unifies both the filtering and sorting?

Secondly, I'm wondering: is there a way to ensure you're picking the best index strategy given a table, or could this be analytically determined based on your dataset and query?

When I run this now, this is my current output from explain:

 Unique  (cost=207985.91..208536.43 rows=18682 width=78) (actual time=823.330..965.769 rows=5248 loops=1)
   ->  Sort  (cost=207985.91..208261.17 rows=110104 width=78) (actual time=823.328..935.933 rows=348232 loops=1)
         Sort Key: first, second DESC, third DESC
         Sort Method: external merge  Disk: 31176kB
         ->  Index Scan using ix_my_table_second_fourth on my_table  (cost=0.44..193872.52 rows=110104 width=78) (actual time=0.017..103.031 rows=348232 loops=1)
               Index Cond: ((fourth = 3) AND (second > 100))
 Planning Time: 0.315 ms
 Execution Time: 971.174 ms

So you can see it uses ix_my_table_second_fourth to filter, but a significant majority of the time is spent sorting, so that the row with the highest second and third values for each first value is obtained.

There are 2 answers

Erwin Brandstetter

All guesswork, based on incomplete information.

Server configuration

You currently suffer from insufficient work_mem, as indicated by the mention of 'Disk' in your query plan:

Sort Method: external merge Disk: 31176kB

Increase the setting (at least locally for the big query) by at least 32 MB, until 'Disk' goes away.
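
For example, scoped to a single transaction (the 64MB value here is a guess; raise it until the sort fits in memory):

BEGIN;
SET LOCAL work_mem = '64MB';  -- only effective until COMMIT / ROLLBACK
-- ... run the big query here ...
COMMIT;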

Then again, you probably won't need more work_mem for the following solution.

Generally, insufficient work_mem indicates one of two things: insufficient RAM and/or sub-optimal DB design, or untapped potential in the server configuration.

Your row estimates are off by a factor of 3. More aggressive autovacuum settings might help. And there is no index-only scan in your plan; maybe you don't tap into the full potential of a covering index. Run a manual VACUUM ANALYZE tbl, and check the effect on your query before doing anything else.
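
For example (the per-table autovacuum values below are illustrative, not a recommendation):

VACUUM ANALYZE tbl;  -- refresh statistics and the visibility map

ALTER TABLE tbl SET (  -- optional: more aggressive autovacuum for this table only
   autovacuum_vacuum_scale_factor  = 0.05,
   autovacuum_analyze_scale_factor = 0.02
);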

Generic index for the generic query

With only a few duplicate values on first in the selection, DISTINCT ON is probably the best query technique, and it should be most efficient to optimize the index for row selection rather than pre-sorting:

CREATE INDEX ON tbl (fourth, second DESC NULLS LAST) INCLUDE (first, third);

second > 100 is more selective than fourth = 3, but that doesn't matter for the order of index columns as long as both are in the lead. The deciding factor: equality comes before range predicates. The columns first and third do not contribute to filtering, so they may as well move into the INCLUDE section to make the index a bit smaller.

Optimized index

Assuming from your comments that the filter fourth = 3 is constant, and the lower bound in second > 100 is steadily increasing. (A timestamp, really, filtered on current date.) So consider this partial index:

CREATE INDEX ON tbl (first, second DESC, third DESC)
WHERE fourth = 3 AND second > 100;

Now we are aiming at pre-sorted data.

You didn't disclose the selectivity of the combined filters. The more selective they are (the fewer rows that qualify), the more a "partial" approach pays off.

With increasing bound (second > 101 etc.) the index keeps being applicable, but more and more rows have to be filtered over time. So recreate the index with increased bound from time to time. (Like once a month?) Can be automated in a cron job or similar. Use CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY to minimize friction with concurrent write access (if needed).
Since we are going to use a partial index anyway, adding the second filter on a moving target helps, even if the index is never recreated.
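
A sketch of such a scheduled rebuild (index names and the raised bound of 200 are placeholders):

CREATE INDEX CONCURRENTLY tbl_part_idx_new ON tbl (first, second DESC, third DESC)
WHERE  fourth = 3 AND second > 200;           -- raised bound

DROP INDEX CONCURRENTLY tbl_part_idx;         -- the old partial index
ALTER INDEX tbl_part_idx_new RENAME TO tbl_part_idx;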

Only being used in the WHERE clause, fourth can now be removed from the index expressions, and index-only scans are still supported. This makes the index smaller (probably; again, missing information), which helps overall, especially with a work_mem bottleneck.

Optimized query

Your original query should already benefit. Notably, if the table is vacuumed enough, you get an index-only scan. But to exploit the full potential of the partial index, emulate an index skip scan:

Cardinalities in your query plan indicate 66 dupes on average per first value in the selection (348232 rows / 5248 rows ≈ 66). Enough to test this approach:

-- explain
WITH RECURSIVE cte AS (
   (  -- base case: top row for the smallest qualifying first
   SELECT first, second, third
   FROM   tbl
   WHERE  second > 100
   AND    fourth = 3
   ORDER  BY first, second DESC, third DESC
   LIMIT  1
   )

   UNION ALL
   (  -- recursive step: top row for the next first after the one just found
   SELECT t.first, t.second, t.third
   FROM   cte c
   JOIN   tbl t ON t.first > c.first
   WHERE  t.second > 100
   AND    t.fourth = 3
   ORDER  BY t.first, t.second DESC, t.third DESC
   LIMIT  1
   )
   )
TABLE cte;

In combination with my partial index, you should see index (or index-only) scans and no sort step in the query plan at all. It should outperform the rest.

jjanes

The best index would probably be a partial one, but since the constant which "second" is tested against is apparently not fixed, that wouldn't be feasible.

To avoid the (unusually expensive) sort, you could have an index which supports the ORDER BY directly. You already report having an index on (first, second DESC, third DESC), which would support the ORDER BY, but that index is needlessly inefficient: it provides no efficient way of removing the wrong values of "fourth", nor does it support an index-only scan. A better index for avoiding the sort would be:

create index on my_table (fourth, first, second DESC, third DESC);

This index can jump directly to the correct value for "fourth", and then scan in the correct order within those rows. It still needs to filter out the wrong values on the "second" condition one by one, but at least it can do that as an index-only scan rather than needing to hop all over the table. (It would of course be better to jump to the correct value of the more selective "second", but since that is an inequality test there is no single correct value, so that is not possible while maintaining the ordering property.)

Based on your existing plan not using a bitmap scan despite returning many thousands of rows from the index, I am assuming your table data is highly correlated on "fourth". By not including "fourth" in your ordering index, you get the double whammy that it can't use an index-only scan, and, "fourth" not being the first column, it doesn't have good locality, so it has to jump all over the table. (Or at least I suspect the planner thinks it will have to.)

If it won't naturally choose my proposed index, or if you want to force the use of your existing ordering index just to see what would happen if it did choose it, you should be able to do that by setting enable_sort = off in the session before running the query.
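
For example, scoped to the session and easy to undo:

SET enable_sort = off;  -- makes explicit sort nodes prohibitively expensive for the planner
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (first) first, second, third
FROM   my_table
WHERE  second > 100
AND    fourth = 3
ORDER  BY first, second DESC, third DESC;
RESET enable_sort;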

Secondly, I'm wondering: is there a way to ensure you're picking the best index strategy given a table, or could this be analytically determined based on your dataset and query?

The planner tries to pick the best index to use. Of course there is no guarantee it succeeds; not in general, and especially not when your row estimates are substantially off.

If the index I propose is not good enough, the last resort may be to partition the table on "second". Then it could skip processing the partitions that can't meet the inequality, while still using the ordering index on the remaining partitions with a Merge Append to combine them.
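
A minimal sketch of that, assuming integer columns (names, types, and range boundaries are all illustrative):

CREATE TABLE my_table_p (
   first  text,
   second integer,
   third  integer,
   fourth integer
) PARTITION BY RANGE (second);

CREATE TABLE my_table_p_0 PARTITION OF my_table_p FOR VALUES FROM (MINVALUE) TO (100);
CREATE TABLE my_table_p_1 PARTITION OF my_table_p FOR VALUES FROM (100) TO (200);
CREATE TABLE my_table_p_2 PARTITION OF my_table_p FOR VALUES FROM (200) TO (MAXVALUE);

-- an ordering index on the parent cascades to every partition
CREATE INDEX ON my_table_p (fourth, first, second DESC, third DESC);

With second > 100, partition pruning would skip my_table_p_0 entirely, and the remaining partitions could be read in order via their per-partition indexes and combined with a Merge Append.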