Perfomance impact when using lexicographically ordered string for primary key

50 views Asked by At

How much performance hit is when using string primary key in PostgreSQL (or generally in any whell-known DBMS)? String is lexicographically ordered on its first half (as string date) and random in another, ex.: 20231201_A32vhfe

More info:

  • Only equality = conditions will be used (no range searches with LIKE)
  • Reads are much more often than inserts
  • Key will have 15-30 length (non constant length)
  • Table will have ~10M records

Optimization goal is for read, write doesn't matter because quite rare.

Thank you!

PS: as additional question (but not mandatory to choose as answer) what would be write performance impact if:

  • Inserts will be with keys that uses present time in first part of the string - so inserts are generally ordered by date but not ordered inside specific day frame
  • Deletes as frequent as inserts but deletes are from arbitrary position
1

There are 1 answers

1
Laurenz Albe On

It will be a tad slower than an integer, but I doubt that you will notice the difference. Run a benchmark if you want certainty.

You should definitely define the column with COLLATE "C" so that you don't have any overhead from natural language collations.