Efficient scalable sequence generator implementation

5.3k views Asked by At

Problem:
I need to implement multiple shared sequence generators, that will be used by 50-100 Tomcat servers. Each sequence generator should start with 1 and be incremented by one after each request. Sequence generator implementation should have atomic increment command. Java client should be available.

Scale:
Up to 50000 sequence generators actively used, for each of them we expect one increment request in 5-10 seconds. Up to 20000 requests per second
50-100 java clients(Tomcat servers). The access to the sequence generators is shared between those servers. Important: Only 2 clients use the same sequence generator.
50-100 times - average number of times each sequence generator is used.
24 h TTL - a generator should be cleaned at most after 24 hours after it was created - so actually there can be significantly more than 50000 sequence generators, but only up to 50000 are expected to be actively receiving requests at the same time.

Performance:
<1ms preferable average response time. Average over 2ms is definitely not good enough.

We ruled out Oracle's Sequence object.
We are currently considering Redis and Memcached.
Both are fast.

What implementation is recommended/good enough?
Are there any other better technologies for this purpose?

Another important question:
Which has a better performance for incr, Redis or Memcached?

Thanks

2

There are 2 answers

2
Vitaliy On

I have the following suggestion:

The naive approach is problematic since it involves very fine grained locking on the record holding the current sequence value. With so many clients- it creates an unbearable contention.

My suggestion is to allow the client to perform a reservation on a range of sequences. So, for the first time, the client orders a range, say from 1 to 1000. The processing of this order is synchronized. So if another asks for a range at the same time- it should block and later receive the range 1001 to 2000.

Each client gets a disjoint pool of sequences and pulls the values from it. As soon as the pool is depleted, it asks the db for another range. This, as mentioned involves a lock.

The thing is that you can fine tune the size of the range to trade off lock contention and resource utilization (a reserved range is not guaranteed to be fully used- the client may decide it no longer needs any sequences or it may crash altogether).

0
Sripathi Krishnan On

Redis should let you do this easily.

  1. Create a key for each sequence generator
  2. Use the INCR command to get the next sequence number
  3. If the returned number is 1, it means the key did not exist previously. In this case, you should also issue the PEXPIRE command to expire the key after 24 hours

To further reduce the latency, you can do several things -

  1. Maintain a Redis server only for sequence generators. In other words, do not store any other data on this server.
  2. Don't use AOF; instead use the RDB persistence. AOF will append every INCR command in the log file, RDB on the other hand will simply store the current value of each sequence.

You should read through diagnose latency problems as well as the page on redis persistence.

Also, the expire command has to be issued at most once in 24 hours, and that will require two round trips to redis. If you want to avoid that, you can create a lua script. The rate limit lua script is a good starting point.