Type of primary key index and what happens on modifying them in MySQL

177 views Asked by At

The highest rated answer here mentions how in MySQL records are stored in the order of primary index. Does that mean the primary index created is a sparse index? And if so, what happens if one modifies the primary key, by changing the column on which it is constructed or modifying one of the entries.

Apologies for asking 2 questions in the same post, but I thought it was better asking them together. I came upon this doubt while dealing with tables which were extremely slow when queried, so I thought maybe inserting rows in some manner which is based on an actual column in the table would help.

1

There are 1 answers

2
Bill Karwin On BEST ANSWER

I'll answer about MySQL's default storage engine, InnoDB. This is the storage engine you should use for all tables in MySQL, unless you have a very specific reason not to.

Answer to Question 1: The primary key is not a sparse index. I know that term to refer to an index that only has values for a subset of rows in the table. I.e. an index with a WHERE clause. But the primary key must account for all rows in the table. It is the column or columns you use if you need to reference any single row uniquely.

Even though the clustered index is stored in order by primary key values, it isn't necessarily stored in that order on disk. Each page in an InnoDB tablespace has a link to the "next" page, which may not be physically located immediately after. It may be much later in the file, or even earlier. The pages may be interspersed with pages for other indexes (even other table if you are using a shared tablespace). Is this what you meant by sparse index?

Answer to Question 2: If you change the column(s) of the table's primary key, the storage of the table must be restructured. InnoDB uses the primary key as the clustered index, which means the rest of the columns are stored along with leaf nodes of the B-tree data structure. Changing the primary key column(s) may change the order of storage, and also the size of each B-tree node (both internal nodes and leaf nodes).

This means while you are restructuring the primary key, your server temporarily needs a lot of extra storage space — up to double the size of the table. Once the restructuring is finished, the old version of the table is automatically dropped.