How will the space be reused after the item identifier is freed in PostgreSQL?

399 views Asked by At

Recently I started to do some research on PostgreSQL free space management and defragmentation. I understand that each page on the heap contains a page header, page identifier, free space, and items. When inserting a new tuple, a new item identifier would be inserted into the beginning of free space and the new item data would be inserted at the end of free space.

After using Vacuum, the item identifier and item data of dead tuples will be removed. If the removed item identifier is in the middle of other idents, there would be a gap between the identifiers. Since usually the new identifiers will be added at the beginning of free space, will this freed space in between ever be reused again? If so, how can we find this space?

Here is an visual example of this scenario:

page after Vacuum

There is unused space between (0,3) and (0,5) after removing some tuple. How will this space be reused again? Thanks!

3

There are 3 answers

0
Laurenz Albe On BEST ANSWER

The PostgreSQL technical term for what you call “item identifier” is “line pointer”. The “item pointer” or “tuple identifier” is a combination of page number and line pointer (the (0,5) in your image).

This indirection is awkward at first glance, but the advantage is that the actual tuple data can be re-shuffled any time to defragment the free space without changing the address of the tuples.

The line pointers form an array after the page header. When a new tuple should be added to the buffer, any free line pointer can be used. If there is no free line pointer, a new line pointer is added at the end of the array. For reference, see PageAddItemExtended in src/backend/storage/page/bufpage.c.

0
jjanes On

The line pointer array is never compacted unless the page is truncated away and later re-added. The unused array slots will be reused as new tuples are added. If the page was once full of a large number of small tuples, then deleted and repopulated with a small number of large tuples, there will be extra unused lp taking up a small amount of space which will never get reused.

You can use heap_page_items from pageinspect with where lp_off=0 to find them.

0
Raj Verma On

Assume that the table contains 2 pages. We take a look at the first page(0th page) Lets assume there is some data inserted and the page has three tuples(rows). Now if lets say we delete the tuple 2 then PG removes Tuple number 2 and reorders the remaining tuples to de-fragmentation, and then updates both the FSM and VM of this page. PostgreSQL continues doing this process till the last page(vacuum).

When you do vacuum the unnecessary line pointers are not removed and they will be reused in future.

Because, if line pointers are removed, all index tuples of the associated indexes must be updated.