Memcache Consistent Hashing, Cluster, PHP code, Ketama and all about it

I have spent the whole day trying to understand Memcache and write code for it in PHP, but I am still confused on a few points. I have gone through many articles and almost every SO question related to this, but could not find exact answers.

1) What would be the code to create a consistently hashed key in PHP? Which libraries do I have to install, and what do I really need to do? Is there a good article to go through?

2) Suppose I have successfully stored a consistently hashed key. If one of my servers goes down, or I add a new server, would it make any difference even though I am using consistent hashing?

3) Will using Memcached::addServers() instead of Memcached::addServer() make any difference for consistent hashing, as stated in http://ru.php.net/manual/en/memcached.addserver.php? If not, what does that note mean?

$m = new Memcached();
$m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
$m->addServers($servers);

4) Is the code above enough for consistent hashing, so that adding or removing servers would not make any difference to keys?

5) What is the Ketama library, and why use it if Memcached::DISTRIBUTION_CONSISTENT can work better? See http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients

6) Do I have to hash my keys in some way, or do I just provide my key and let Memcached handle the rest?

Please, I need your help to understand this and implement it in my production environment as soon as possible. Your answers will help me understand what I should code.

There are 2 answers

Abdul Jabbar (BEST ANSWER)

Well, those are many questions at once. Let me try my best to answer them one by one.

1) What would be the code to create a consistently hashed key in PHP? Which libraries do I have to install, and what do I really need to do? Is there a good article to go through?

The code you put in your question is enough for consistent hashing in PHP. You only need the libmemcached client library (which the memcached extension is built on) to use consistent hashing with Memcached. Just add the following line next to the other setOption() call:

$m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);

It is highly recommended to enable this option if you want to use consistent hashing, and it may be enabled by default in future releases. For the full list of constants and their definitions, see http://www.php.net/manual/en/memcached.constants.php

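Another approach, which may be more convenient, is to set the hashing strategy globally in php.ini (I haven't tested this yet). Note that this directive belongs to the older memcache extension (the Memcache class), not to the memcached extension used above: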

memcache.hash_strategy = consistent;

as suggested in http://blog.fedecarg.com/2008/12/24/memcached-consistent-hashing-mechanism/; then you don't need to specify it for each Memcache instance. The default value described there is standard, which uses a modulus calculation and will not help when you add or remove servers.

2) Suppose I have successfully stored a consistently hashed key. If one of my servers goes down, or I add a new server, would it make any difference even though I am using consistent hashing?

As lsmooth said, there will always be an impact when servers are added or removed, but it is minimal. For example, adding 1 server to 3 servers remaps roughly 1/4 = 25% of the keys, so the more servers you have, the smaller the share of keys you lose when one is added or removed.
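To make the 25% figure more concrete, here is a rough simulation in plain PHP that compares modulo distribution with a hash ring when going from 3 to 4 servers. It is only a simplified sketch: the helper names (ringHash, buildRing, consistentLookup, moduloLookup, movedFraction), the md5-based ring and the 100 points per server are my own assumptions, not the actual Ketama algorithm that libmemcached uses. It assumes PHP 7.4+ on a 64-bit build.

```php
<?php
// Simplified illustration only: NOT the exact Ketama algorithm used by libmemcached.

/** Map a string to a 32-bit position on the ring (hypothetical md5-based helper). */
function ringHash(string $value): int
{
    return (int) hexdec(substr(md5($value), 0, 8));
}

/** Build a ring sorted by position, with $points virtual nodes per server. */
function buildRing(array $servers, int $points = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $points; $i++) {
            $ring[ringHash("$server-$i")] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

/** Pick the first server at or after the key's position, wrapping around. */
function consistentLookup(array $ring, string $key): string
{
    $hash = ringHash($key);
    foreach ($ring as $position => $server) {
        if ($position >= $hash) {
            return $server;
        }
    }
    return reset($ring); // went past the last point: wrap to the first one
}

/** Classic modulo distribution: server index = hash % server count. */
function moduloLookup(array $servers, string $key): string
{
    return $servers[crc32($key) % count($servers)];
}

/** Fraction of test keys that map to a different server after the change. */
function movedFraction(callable $before, callable $after, int $keys = 2000): float
{
    $moved = 0;
    for ($i = 0; $i < $keys; $i++) {
        if ($before("key$i") !== $after("key$i")) {
            $moved++;
        }
    }
    return $moved / $keys;
}

$three = ['A', 'B', 'C'];
$four  = ['A', 'B', 'C', 'D'];
$ring3 = buildRing($three);
$ring4 = buildRing($four);

$moduloMoved = movedFraction(
    fn($k) => moduloLookup($three, $k),
    fn($k) => moduloLookup($four, $k)
);
$consistentMoved = movedFraction(
    fn($k) => consistentLookup($ring3, $k),
    fn($k) => consistentLookup($ring4, $k)
);

printf(
    "modulo: %.0f%% of keys moved, ring: %.0f%% of keys moved\n",
    $moduloMoved * 100,
    $consistentMoved * 100
);
```

With the ring, a key either stays where it was or moves to the new server D, which is why only about a quarter of the keys are affected. With modulo, most keys end up on a different server.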

3) Will using Memcached::addServers() instead of Memcached::addServer() make any difference for consistent hashing, as stated in http://ru.php.net/manual/en/memcached.addserver.php? If not, what does that note mean?

As lsmooth said, and he is correct: it is preferable to use addServers(). Note that addServers() must be called after all options are set with setOption(), otherwise the options won't apply to those servers.

4) Is the code above enough for consistent hashing, so that adding or removing servers would not make any difference to keys?

Already answered under question 1.

5) What is the Ketama library, and why use it if Memcached::DISTRIBUTION_CONSISTENT can work better? See http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients

libketama is the library that the consistent hashing key distribution algorithm is based on. So using consistent hashing in Memcached means using the Ketama scheme, and that is all there is to it.

6) Do I have to hash my keys in some way, or do I just provide my key and let Memcached handle the rest?

As Yvan put it (at http://www.dugwood.com/895442.html#dwCmtForm): "The Memcached client will hash your keys automatically. Say you have 3 servers, A, B and C, and 9 keys, K1 to K9. For example, the client hash algorithm will store them as follows: K1/K2/K3 stored on A, K4/K5/K6 stored on B, K7/K8/K9 stored on C. If your server B goes down, its keys (K4/K5/K6) will be stored evenly on the 2 remaining servers (A and C). For example, K4 will go to A, and K5/K6 will go to server C.

That's just an example, not the real algorithm. You can find out which key goes on which server with the function $memcached->getServerByKey('K4'). Then make one server go down, and see what getServerByKey() returns after this failure."
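If you want to try what Yvan describes, something like the sketch below might help. It assumes the memcached extension is installed (otherwise it prints nothing). The host names are placeholders, mapKeysToServers is a helper name I made up, and "server down" is only simulated by rebuilding the server list without the second host. As far as I know getServerByKey() only computes the mapping and does not need a reachable server, but I have not verified that on every version.

```php
<?php
// Sketch only: host names are hypothetical placeholders.
$servers = [
    ['memcache1.example.com', 11211],
    ['memcache2.example.com', 11211],
    ['memcache3.example.com', 11211],
];

/**
 * Return "host:port" of the server each key maps to.
 * Returns an empty array when the memcached extension is not installed.
 */
function mapKeysToServers(array $servers, array $keys): array
{
    if (!class_exists('Memcached')) {
        return [];
    }
    $m = new Memcached();
    // Set the options first, then add the servers.
    $m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
    $m->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);
    $m->addServers($servers);

    $map = [];
    foreach ($keys as $key) {
        // Array with host, port and weight on success, false on failure.
        $server = $m->getServerByKey($key);
        $map[$key] = $server ? $server['host'] . ':' . $server['port'] : null;
    }
    return $map;
}

$keys   = ['K1', 'K2', 'K3', 'K4', 'K5', 'K6'];
$before = mapKeysToServers($servers, $keys);
// Simulate losing the second server by rebuilding the list without it.
$after  = mapKeysToServers([$servers[0], $servers[2]], $keys);

foreach ($before as $key => $server) {
    echo "$key: $server -> {$after[$key]}\n";
}
```

Keys that were on the first or third server should mostly keep their server in the second column; only the keys that lived on the removed one have to go somewhere else.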

lsmooth

Consistent Hashing is supported by PHP's memcached extension. You don't have to do anything except make use of it in your code like this:

<?php
  $servers = array(
    array('memcache1.example.com', 11211),
    array('memcache2.example.com', 11211)
    );
  $m = new Memcached();
  $m->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
  $m->addServers($servers);
?>

When you then start adding items to the cache, the extension distributes them across the servers automatically in a way that minimizes cache losses when you add servers. If it cannot retrieve an item from the server where it is supposed to be (because the server is down, for example), you will have to handle that in your PHP code yourself.
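Handling the miss yourself usually means falling back to the original data source and writing the value back. Here is a minimal sketch of that pattern. ArrayCache is a made-up in-memory stand-in so the snippet runs without a server, and fetchWithFallback is a hypothetical helper name. With a real Memcached object you would pass $m instead, since it also has get() and set().

```php
<?php
// Hypothetical in-memory stand-in so the sketch runs without a server.
// A real Memcached object offers get() and set() that can be used the same way.
class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false; // mimic Memcached: false on a miss
    }

    public function set(string $key, $value, int $expiration = 0): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

/**
 * Return the cached value, or rebuild it with $loader when the cache has
 * nothing (a miss, or the owning server is unreachable) and store it again.
 * Limitation: a cached value of false cannot be told apart from a miss here.
 */
function fetchWithFallback($cache, string $key, callable $loader, int $ttl = 300)
{
    $value = $cache->get($key);
    if ($value === false) {
        $value = $loader();              // e.g. query the database
        $cache->set($key, $value, $ttl); // best effort, failures are ignored
    }
    return $value;
}

$cache  = new ArrayCache();
$calls  = 0;
$loader = function () use (&$calls) {
    $calls++;
    return 'profile-data';
};

$first  = fetchWithFallback($cache, 'user:42', $loader); // miss: loader runs
$second = fetchWithFallback($cache, 'user:42', $loader); // hit: loader skipped
echo "$first / $second / loader calls: $calls\n";
```

If you need to store false as a real value, checking $m->getResultCode() against Memcached::RES_NOTFOUND after get() is a way to tell a genuine miss from a stored false.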

Using addServers() instead of addServer() makes no difference for consistent hashing. As stated in the documentation, though, you should use addServers() when adding multiple servers so that the internal data structures are only updated once.

PHP's implementation of consistent hashing is based on libketama, but it does not need libketama to be installed at all. The extension takes care of distributing the items to the different servers for you so that it minimizes cache losses. There will always be an impact when servers are removed or added.