Rebalancing a table shard, with MySQL/InnoDB


I have a huge InnoDB table (>1TB, >1B rows) that I would like to shard: I would like to make multiple smaller independent tables from that big table.

How can I do that?

What I've already tried:

  • Moving rows to partitions by SELECTing them from the original table and INSERTing them into the partitions. This takes ages, and keeping rows in sync during the operation is hard (though it seems doable with triggers, as long as the partition is on the same server). I haven't found a ready-to-use tool for this.
  • Copying the entire table and then deleting the rows that do not belong to the partition. Still very slow, especially given the size of the table. This is what MySQL Fabric does, apparently.
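For the trigger-based approach mentioned above, a minimal sketch might look like the following. All names are hypothetical: it assumes a same-server partition table `Huge_p1` with the same schema as `Huge`, receiving accounts 1–1000.

```sql
-- Keep Huge_p1 in sync with new writes while the backfill runs (sketch).
-- A complete version would also need AFTER UPDATE and AFTER DELETE triggers.
CREATE TRIGGER huge_p1_ins AFTER INSERT ON Huge
FOR EACH ROW
  INSERT INTO Huge_p1
  VALUES (NEW.account_id, NEW.id, NEW.foo, NEW.bar, NEW.baz);

-- Backfill in bounded chunks so no single transaction gets huge;
-- INSERT IGNORE skips rows the trigger already copied.
INSERT IGNORE INTO Huge_p1
SELECT * FROM Huge
WHERE account_id BETWEEN 1 AND 1000
  AND id BETWEEN 1 AND 100000;   -- advance this id range chunk by chunk
```

This is still the slow "copy the rows" strategy, but chunking plus triggers is what makes it survivable on a >1B-row table.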

Random crazy ideas:

  • Splitting the .ibd file offline and importing it into the server, but I don't know whether a tool capable of doing this exists.

SHOW CREATE TABLE:

CREATE TABLE `Huge` (
  `account_id` int(11) NOT NULL,
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `foo` varchar(255) NOT NULL,
  `bar` int(11) NOT NULL,
  `baz` char(2) NOT NULL,
  PRIMARY KEY (`account_id`,`id`),
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;

The sharding key would be account_id. The primary key is currently (account_id, id) so that rows are clustered by account_id.

Is there any tool that automates this task? Is there a better way of doing this?


There are 2 answers

Rick James

"Sharding" is splitting data (usually one table) across multiple servers. "Partitioning" is splitting a table into multiple subtables on the same server. Which are you doing? Fabric does sharding. Your comment about .ibd implied you are thinking about partitioning.

Assuming that you want to PARTITION one huge table, I need to first ask why. This is a serious question because most (I do mean most) people think that partitioning will magically produce some benefit, when in reality it won't. I believe there are only 4 use cases for partitioning. Does your situation fall into any of them?

If, on the other hand, you want Sharding, then please provide SHOW CREATE TABLE and discuss which column you would shard on.

Edit (after getting clarity on goals)

I hope you do not have explicit FOREIGN KEYs; they will not work with either partitioning or sharding.

`id` bigint(20) NOT NULL AUTO_INCREMENT,
UNIQUE KEY `id` (`id`)

is a problem in two ways.

  • There is no need to make id UNIQUE; the only requirement for an AUTO_INCREMENT is that it be the first column in some index. So this would be less burden on the system:

    INDEX(id)

  • An AUTO_INCREMENT does work for a PARTITIONed table, but it does not work for a sharded table. You will need to assess the purpose of id. If it only has to be unique within one shard, there is no real problem. If id needs to be unique across all shards, that is more difficult. Probably the only solution is a technique for fetching id(s) from some central server; however, that introduces a single point of failure and a bottleneck. If you need to go that direction, I can advise on how to avoid those drawbacks.
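As a sketch of both points above (real MySQL syntax, but test on a copy first): the index change is a single ALTER, and one common way to avoid a central id server is to interleave the AUTO_INCREMENT sequences across shards.

```sql
-- Replace the UNIQUE index on id with a plain one:
ALTER TABLE Huge DROP INDEX id, ADD INDEX (id);

-- One way to keep AUTO_INCREMENT ids unique across N shards without a
-- central server: interleave the sequences. Here, shard 2 of 4 will
-- generate ids 2, 6, 10, 14, ...
SET GLOBAL auto_increment_increment = 4;
SET GLOBAL auto_increment_offset   = 2;
```

The interleaving trick trades the single point of failure for a maximum shard count that must be chosen up front (4 in this sketch).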

How to migrate to the ultimate, sharded, system?

I do not recommend any digression into PARTITIONing, it won't really help in the long run. And the cost of REORGANIZE PARTITION is high -- copying all the rows over, both the extracted rows and the remaining rows. Edit: If you do use partitions, use pt-online-schema-change to do the splitting with minimal downtime.

Instead, I recommend perfecting a tool that will migrate one account_id from one shard to another. This has to be custom code, because there are probably other tables (and references between tables) that will be affected by moving an account to a different server. In the long run, this tool will be useful for load balancing, hardware upgrades, software upgrades, and even schema changes. When you need to change something, create new shards with the new OS/version/schema/whatever and migrate users to them.

The simple approach to this tool is

  1. "Block" writes for that one account
  2. Copy the records to the new shard
  3. Change the gatekeeper to know that that account is now on the new shard
  4. Unblock writes
  5. Eventually (and gradually) DELETE rows on the old shard
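The steps above can be sketched as SQL plus application-level actions. The gatekeeper table and shard names are hypothetical; this assumes the application routes each account through a `gatekeeper.account_map` lookup.

```sql
-- Sketch: move account 42 to shard_new.

-- 1. Block writes for account 42 (application-level flag, not shown).

-- 2. Copy the rows; one option is dump-and-load from the shell:
--      mysqldump mydb Huge --where="account_id=42" --no-create-info \
--        | mysql -h shard_new mydb

-- 3. Repoint the gatekeeper so new queries go to the new shard:
UPDATE gatekeeper.account_map
SET shard = 'shard_new'
WHERE account_id = 42;

-- 4. Unblock writes (application-level).

-- 5. Gradually delete on the old shard, in chunks:
DELETE FROM Huge WHERE account_id = 42 LIMIT 10000;  -- repeat until 0 rows affected
```

The chunked DELETE in step 5 is what makes the cleanup "gradual": each statement is a small transaction, so replication and other traffic are not stalled.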

This is not much of a problem if an account is "small". But if you need minimal downtime (writes blocked), then we can discuss a more complex approach.

(In case you have not guessed, I have 'been there, done that'.)

Ormoz

You could modify your table structure. This table is not in 2NF: id is unique (a candidate key) yet appears inside the composite primary key, so the other attributes (foo, bar, baz, and even account_id) depend on a proper subset of the primary key, namely id. The following could do the same job with fewer constraints:

 id bigint(20) not null auto_increment primary key

Now, by creating an index on account_id, you could get most of the benefit of the current primary key (account_id, id).

As a second suggestion, you could split the table into two parts: one containing foo, and the other holding the rest of the columns. This way you get a relatively small table (the second one below) with fixed row length (and hence faster access) that stores most of the columns, plus a variable-row-length table that is smaller than the current table and will be accessed less frequently.

In summary, before partitioning the table, I suggest splitting it into:

CREATE TABLE `fixed_length` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `account_id` int(11) NOT NULL,
  `bar` int(11) NOT NULL,
  `baz` char(2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;

and

CREATE TABLE `variable_length` (
  `id` bigint(20) NOT NULL PRIMARY KEY,
  `foo` varchar(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;

and an index for account_id:

 CREATE INDEX ix_account_ix ON fixed_length(account_id);

Now, if you want to partition data by account_id, you could keep fixed_length intact and do the partitioning only on the variable_length table (by whatever method you choose).
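After such a split, a full row is reassembled with a join on id, for example (sketch):

```sql
-- Rebuild the original row shape from the two tables:
SELECT f.account_id, f.id, v.foo, f.bar, f.baz
FROM fixed_length AS f
JOIN variable_length AS v USING (id)
WHERE f.account_id = 42;
```

That join is the price paid for the narrower, faster fixed-length table; queries that never touch foo avoid it entirely.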