I'm trying to understand the fillfactor in case of btree indexes. From the postgres documentation:

For B-trees, leaf pages are filled to this percentage during initial index build, and also when extending the index at the right (adding new largest key values).

Firstly, what does it mean extending the index at the right? Does it means split the rightest leaf in the B-tree? (images would be helpful).

Why the fillfactor is taken in consideration in this case, and is not taken in consideration only when the index is created?

As far as I know, postgresql will take in consideration the fill factor during the creation. For example, with a fill factor = 50%, right after index creation, the leaves will be 50% full at the most, then, with new insertions, this parameter is not respected (expected for this "right extension").

1

There are 1 answers

0
Laurenz Albe On

The rightmost leaf page of a B-tree index is the one with the largest value in it:

                        +---------+
                        |root page|
                        +---------+
                        /          \
                       /            \
           +----------+              +----------+
           |inner page|              |inner page|
           +----------+              +----------+
            /        \                 /       \
           /          \               /         \
+-----------+  +-----------+  +-------------+  +-------------+
|1 3 ... 100|  |101 ... 900|  |1000 ... 2000|  |2000 ... 5000|
+-----------+  +-----------+  +-------------+  +-------------+
  leaf page      leaf page       leaf page    rightmost leaf page

Since it is a common case that new values are inserted into the rightmost page, for example with auto-incrementing sequence values or with ever-increasing timestamps, insertions there are treated differently, much like when an index is created. The rationale is that such insertions are usually caused by INSERTs rather than by UPDATEs.

If the rightmost page were treated like all others, then auto-incremented primary keys or time series would often create a densely packed index, regardless of the fillfactor setting. Then the first UPDATE that is not HOT and creates a new index entry would cause an index page to be split, which is expensive. This would defy the concepf of setting a fillfactor below 100 to prevent excessive page splitting and fragmentation.

With INSERTs that are all over the place, like UUIDs for a primary key, many leaf pages will automatically not be filled up completely.