HBASE Sequential row key (YYYYMMDDHHMMSS), Deterministic Non-Random Salt


My row key starts with "YYYYMMDDhhmmss", where 'ss' is always 00. Example: 20170603162100, which corresponds to 16:21 on 3rd June 2017. (Don't ask me why, but the timestamp has to be at the start of the key!)

The data arrives once per minute, so every minute value is unique.

This suffers from region hot-spotting, because consecutive row keys end up next to each other on the same region server.

My read pattern: get data for a single minute (not for an hour, a day, a month, or a year).

Say I have 10 region servers.

Here is a solution I am thinking of, which is a kind of salt (but deterministic, not random):

I look at the mm (minute) part of the key and assign a salt based on it:

00 minute: prefix A to row key
01 minute: prefix B to row key
..
09 minute: prefix J to row key
10 minute: prefix A to row key
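A minimal sketch of the prefix rule described above, assuming the key is a 14-digit YYYYMMDDhhmmss string and that the salt is the minute modulo the number of buckets (10 here, to match minute 10 wrapping back to A). The function names `minute_salt` and `salted_key` are made up for illustration and are not part of any HBase API.

```python
def minute_salt(timestamp_key: str, buckets: int = 10) -> str:
    """Return the prefix letter for a YYYYMMDDhhmmss key, derived from its minute."""
    if len(timestamp_key) != 14 or not timestamp_key.isdigit():
        raise ValueError("expected a 14-digit YYYYMMDDhhmmss key")
    minute = int(timestamp_key[10:12])       # the 'mm' part
    return chr(ord("A") + minute % buckets)  # 0 -> A, 1 -> B, ..., 9 -> J, 10 -> A


def salted_key(timestamp_key: str, buckets: int = 10) -> str:
    """Prepend the deterministic salt to the original key."""
    return minute_salt(timestamp_key, buckets) + timestamp_key


if __name__ == "__main__":
    # Minute 21 gives 21 % 10 = 1, so the prefix is B.
    print(salted_key("20170603162100"))
```

Because the prefix depends only on the key itself, the same function can be called on the write path and on the read path.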

This way all 'A' keys should go to the first region server, and so forth. The possible advantage: all requests for a single minute go to the same region server, which is bearable for me, and the very next minute all requests go to some other region server.
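To check how such a scheme would spread keys, here is a small simulation. It assumes the table is pre-split into ten key ranges at the boundaries "B" through "J" (a hypothetical choice, the question does not say how the table is split) and only models which key range a row falls into. Which region server actually hosts each range is decided by HBase, not by the key.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical pre-split points: ranges are [..B), [B..C), ..., [J..)
SPLIT_KEYS = [chr(ord("A") + i) for i in range(1, 10)]


def region_index(row_key: str) -> int:
    """Index of the key range that would contain row_key, given SPLIT_KEYS."""
    return bisect_right(SPLIT_KEYS, row_key)


def salted_minute_key(date_hour: str, minute: int, buckets: int = 10) -> str:
    """Build a salted key for one minute, e.g. ('2017060316', 21) -> 'B20170603162100'."""
    prefix = chr(ord("A") + minute % buckets)
    return f"{prefix}{date_hour}{minute:02d}00"


def keys_per_region(date_hour: str) -> Counter:
    """Count how many of the 60 minute keys in one hour fall into each range."""
    return Counter(
        region_index(salted_minute_key(date_hour, m)) for m in range(60)
    )


if __name__ == "__main__":
    for region, count in sorted(keys_per_region("2017060316").items()):
        print(region, count)
```

Over a full hour each range receives the same number of minute keys, but within any single minute all traffic still goes to one range, which is the trade-off described above.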

Also, when retrieving, I won't have to do parallel reads, because I already know the salt.

Can someone explain where I might be wrong?

There is 1 answer

Answered by vvg

Well, with single letters of the English alphabet you can cover only 26 minutes, so I would probably suggest a two-letter salt; it should still distribute properly. (How many nodes do you have?)
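One possible reading of the two-letter suggestion, sketched under assumptions: the bucket number is the minute modulo a chosen bucket count (60 here, one bucket per minute of the hour), encoded as two uppercase letters from AA to ZZ. Both the encoding and the function names are illustrative and not taken from the answer.

```python
import string


def two_letter_salt(bucket: int) -> str:
    """Encode a bucket number 0..675 as two uppercase letters (AA, AB, ..., ZZ)."""
    if not 0 <= bucket < 26 * 26:
        raise ValueError("bucket must be between 0 and 675")
    letters = string.ascii_uppercase
    return letters[bucket // 26] + letters[bucket % 26]


def salted_key(timestamp_key: str, buckets: int = 60) -> str:
    """Prefix a YYYYMMDDhhmmss key with a two-letter salt derived from its minute."""
    if len(timestamp_key) != 14 or not timestamp_key.isdigit():
        raise ValueError("expected a 14-digit YYYYMMDDhhmmss key")
    minute = int(timestamp_key[10:12])
    return two_letter_salt(minute % buckets) + timestamp_key


if __name__ == "__main__":
    print(salted_key("20170603162100"))
```

With more buckets than region servers, several prefixes share a server, so the bucket count is something to tune against the actual number of nodes.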

Alternatively, you could simply drop the seconds from your row key and reverse it.
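A minimal sketch of that alternative, assuming it applies only to the 14-digit timestamp portion of the key and that any remaining key parts are handled separately. The function names are hypothetical.

```python
def reversed_minute_key(timestamp_key: str) -> str:
    """Drop the 'ss' part of a YYYYMMDDhhmmss key and reverse the remaining digits."""
    if len(timestamp_key) != 14 or not timestamp_key.isdigit():
        raise ValueError("expected a 14-digit YYYYMMDDhhmmss key")
    return timestamp_key[:12][::-1]


def original_minute(reversed_key: str) -> str:
    """Recover YYYYMMDDhhmm from a reversed key."""
    return reversed_key[::-1]


if __name__ == "__main__":
    for key in ("20170603162100", "20170603162200", "20170603162300"):
        print(key, reversed_minute_key(key))
```

After reversal the fastest-changing digit (the units of the minute) comes first, so consecutive minutes start with different characters. The mapping is deterministic, so a single-minute lookup can rebuild the key directly, though time-ordered range scans are no longer possible.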