I recently started researching PostgreSQL free space management and defragmentation. As I understand it, each heap page contains a page header, item identifiers, free space, and items. When a new tuple is inserted, a new item identifier is added at the beginning of the free space and the new item data is added at the end of the free space.
After running VACUUM, the item identifiers and item data of dead tuples are removed. If a removed item identifier sits in the middle of the other identifiers, there will be a gap between them. Since new identifiers are usually added at the beginning of the free space, will this freed space in between ever be reused? If so, how is it found?
Here is a visual example of this scenario:
There is unused space between (0,3) and (0,5) after removing some tuple. How will this space be reused again? Thanks!
The PostgreSQL technical term for what you call “item identifier” is “line pointer”. The “item pointer” or “tuple identifier” is a combination of page number and line pointer (the (0,5) in your image). This indirection is awkward at first glance, but the advantage is that the actual tuple data can be re-shuffled at any time to defragment the free space without changing the address of the tuples.
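To illustrate why the indirection helps, here is a minimal toy model in Python. It is only a sketch and not PostgreSQL's actual page format: the class ToyPage, its methods, and the 64-byte page size are all made up for illustration. The point is that compaction moves the tuple bytes and rewrites the offset stored in each line pointer, while the line pointer number that other structures refer to stays the same.

```python
# Toy model only: names and sizes are hypothetical, not PostgreSQL's layout.
PAGE_SIZE = 64


class ToyPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.line_pointers = []  # each entry is (offset, length), or None if unused
        self.upper = PAGE_SIZE   # start of the tuple data area (grows downward)

    def add(self, payload: bytes) -> int:
        """Store payload at the end of free space; return its 1-based line pointer number."""
        self.upper -= len(payload)
        self.data[self.upper:self.upper + len(payload)] = payload
        self.line_pointers.append((self.upper, len(payload)))
        return len(self.line_pointers)

    def remove(self, lp: int) -> None:
        """Mark a line pointer unused; its data bytes become a hole."""
        self.line_pointers[lp - 1] = None

    def fetch(self, lp: int) -> bytes:
        offset, length = self.line_pointers[lp - 1]
        return bytes(self.data[offset:offset + length])

    def compact(self) -> None:
        """Repack live tuple data at the end of the page, keeping line pointer numbers."""
        live = [(i, self.fetch(i + 1))
                for i, entry in enumerate(self.line_pointers) if entry is not None]
        self.upper = PAGE_SIZE
        for i, payload in live:
            self.upper -= len(payload)
            self.data[self.upper:self.upper + len(payload)] = payload
            self.line_pointers[i] = (self.upper, len(payload))


page = ToyPage()
a = page.add(b"aaaa")
b = page.add(b"bbbbbb")
c = page.add(b"cc")

page.remove(b)                              # leaves a hole in the data area
offset_before = page.line_pointers[c - 1][0]
page.compact()                              # tuple data moves, addresses do not
offset_after = page.line_pointers[c - 1][0]

assert offset_before != offset_after        # the bytes of tuple c were moved
assert page.fetch(c) == b"cc"               # but line pointer c still finds them
```

In this sketch an index entry would only need to remember the line pointer number, so it never has to be updated when the page is compacted.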
The line pointers form an array after the page header. When a new tuple should be added to the buffer, any free line pointer can be used. If there is no free line pointer, a new line pointer is added at the end of the array. For reference, see PageAddItemExtended in src/backend/storage/page/bufpage.c.
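The slot selection described above can be sketched roughly as follows. This is a simplified Python illustration, not a translation of PageAddItemExtended; the function add_item and the list representation are hypothetical, and the real function performs additional checks that are left out here.

```python
def add_item(line_pointers, item):
    """Place item in the first unused slot, or append a new slot.

    Returns the 1-based line pointer number that was used.
    """
    for i, slot in enumerate(line_pointers):
        if slot is None:            # unused line pointer: reuse it
            line_pointers[i] = item
            return i + 1
    line_pointers.append(item)      # no unused slot: extend the array
    return len(line_pointers)


# Five tuples on page 0; VACUUM then frees line pointer 4,
# leaving a gap between (0,3) and (0,5) as in the question.
line_pointers = ["t1", "t2", "t3", "t4", "t5"]
line_pointers[3] = None

first = add_item(line_pointers, "t6")   # fills the gap
second = add_item(line_pointers, "t7")  # no gap left, so the array grows

assert first == 4
assert second == 6
```

So in the scenario from the question, the next tuple inserted into that page could end up as (0,4) rather than (0,6).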