Indexing the condition of a partial index in Postgres

2k views Asked by At

I'm trying to reason about how Postgres partial indexes are stored inside Postgres. Suppose I create an index like this

CREATE INDEX orders_unbilled_index ON orders (order_nr)
WHERE billed is not true

in order to quickly run a query like

SELECT *
FROM orders
WHERE billed is not true AND order_nr > 1000000

Postgres obviously stores an index on order_nr built over a subset of the orders table as defined by the conditional expression billed is not true. However, I have a couple of questions related to this:

  1. Does Postgres store another index internally on billed is not true to quickly find the rows associated with the partial index?
  2. If (1) is not the case, would it then make the query above run faster if I made a separate index on billed is not true? (assuming a large table and few rows with billed is true)

EDIT: My example query based on the docs is not the best due to how boolean indexes are rarely used, but please consider my questions in the context of any conditional expression.

2

There are 2 answers

0
Laurenz Albe On BEST ANSWER

A b-tree index can be thought of an ordered list of index entries, each with a pointer to a row in the table.

In a partial index, the list is just smaller: there are only index entries for rows that meet the condition.

If you have the index condition in your WHERE clause, PostgreSQL knows it can use the index and doesn't have to check the index condition, because it will be satisfied automatically.

So:

  1. No, any row found via the index will automatically satisfy the index condition, so using the index is enough to make sure it is satisfied.

  2. No, an index on a boolean column will not be used, because it would not be cheaper than this partial index, and the partial index can be used to check the condition on order_nr as well.

    It is actually the other way around: the partial index could well be used for queries that only have the boolean column in the WHERE condition, if there are few enough rows that satisfy the condition.

0
Tim Biegeleisen On

It is my understanding that Postgres will simply build an index which can only be used to lookup records which have billed as not being true. That is, the resulting B-tree would be indexed by the order_nr, but would only link back to the original table when billed be false.

If you keep reading the documentation, immediately after what you cited, you will find the following query as an example:

SELECT * FROM orders WHERE billed is not true AND amount > 5000.00;

It is the case that Postgres might even choose to use the index you defined on the above query. It can use your index to satisfy this query by scanning the entire index. If there are a relatively small number of orders which are not yet billed, then scanning the index on order_nr might still be preferable to doing a full table scan.

So, the answer to your question #1 is that no, there is no separate index for billed, but rather the index on order_nr only can be used for records which have billed set to false. And for #2, yes, a second index on billed is not true could be used assuming few records are unbilled. However, even your current index might even be used as is.