Token-aware Astyanax connection pool connects to nodes without distributing connections evenly


I am using an Astyanax connection pool defined like this:

ipSeeds = "LOAD_BALANCER_HOST:9160";
conPool.setSeeds(ipSeeds)
.setDiscoveryType(NodeDiscoveryType.TOKEN_AWARE)
.setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE);

My cluster has 4 nodes, and 8 client machines connect to it. LOAD_BALANCER_HOST forwards requests to one of the four nodes.

On a client node, I have:

$netstat -an | grep 9160 | awk '{print $5}' | sort |uniq -c
    235 node1:9160
    680 node2:9160
      4 node3:9160
      4 node4:9160

So although the ConnectionPoolType is TOKEN_AWARE, my client seems to connect mainly to node2, sometimes to node1, and almost never to node3 or node4.
The question is: why is this happening? Shouldn't a token-aware connection pool query the ring for the node list and connect to all active nodes using a round-robin algorithm?


There is 1 answer

Daniel Schulz On BEST ANSWER

William Price is right: because you are using a TokenAwarePolicy, and possibly a default partitioner, two things happen. First, your data can end up stored unevenly across your nodes. Second, when you query, the load-balancing policy makes your driver remember which nodes own the requested data and send each request straight to them.
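To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model, not Astyanax or driver code: the class `RoutingSketch`, its methods, and the workload (700 requests for one hot row plus 300 other rows) are all made up for illustration, and a hash modulo the node count stands in for real token ranges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of two host-selection strategies (hypothetical, not a real driver API). */
class RoutingSketch {

    /** Token-aware: the row key alone decides the node (hash modulo stands in for token ranges). */
    static int tokenAwareNode(String rowKey, int nodeCount) {
        return Math.floorMod(rowKey.hashCode(), nodeCount);
    }

    /** Counts requests per node when every request goes to the node owning its key. */
    static int[] countTokenAware(List<String> rowKeys, int nodeCount) {
        int[] perNode = new int[nodeCount];
        for (String rowKey : rowKeys) {
            perNode[tokenAwareNode(rowKey, nodeCount)]++;
        }
        return perNode;
    }

    /** Counts requests per node when the key is ignored and nodes are used in turn. */
    static int[] countRoundRobin(List<String> rowKeys, int nodeCount) {
        int[] perNode = new int[nodeCount];
        for (int i = 0; i < rowKeys.size(); i++) {
            perNode[i % nodeCount]++;
        }
        return perNode;
    }

    /** Assumed workload: one hot row gets 700 of 1000 requests. */
    static List<String> skewedWorkload() {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 0; i < 700; i++) {
            rowKeys.add("hot-row");
        }
        for (int i = 0; i < 300; i++) {
            rowKeys.add("row-" + i);
        }
        return rowKeys;
    }

    public static void main(String[] args) {
        List<String> workload = skewedWorkload();
        System.out.println("token-aware: " + Arrays.toString(countTokenAware(workload, 4)));
        System.out.println("round-robin: " + Arrays.toString(countRoundRobin(workload, 4)));
    }
}
```

In this model round robin always gives each of the 4 nodes 250 requests, while the token-aware variant sends at least 700 to whichever node owns the hot row. That is only one possible explanation for a skew like the one in your netstat output, but it shows why a token-aware pool is not expected to spread traffic evenly.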

You can improve your cluster's performance by using a different, or maybe a custom, partitioner to distribute your data evenly. To query nodes at random, use either

The latter, of course, needs the definition of data centers in your keyspace.

Without any further information, I would suggest just changing the partitioner, since a TokenAware load-balancing policy is usually a good idea. The main load will end up on the nodes that own the data anyway; the TokenAware policy just gets you to the right coordinator more quickly.
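As a rough illustration of why the partitioner matters, the following sketch compares an order-preserving style of token with a hashed one. It is a simplified model in plain Java, not Cassandra code: `PartitionerSketch`, the one-byte token space, the equal four-way split of that space, and the `user:<n>` keys are all assumptions made for the example. The only real-world fact it leans on is that Cassandra's RandomPartitioner derives tokens from an MD5 hash of the key.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Toy comparison of ordered vs. hashed token assignment (hypothetical names throughout). */
class PartitionerSketch {
    static final int NODES = 4;

    /** Order-preserving style: the token is the key's first byte, so similar keys stay together. */
    static int orderedToken(String key) {
        return key.getBytes(StandardCharsets.UTF_8)[0] & 0xFF;
    }

    /** Hash style: the token is the first byte of the key's MD5 digest. */
    static int hashedToken(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            return digest[0] & 0xFF;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        }
    }

    /** Each node owns an equal quarter of the 0..255 token space (an assumption of this sketch). */
    static int ownerOf(int token) {
        return token * NODES / 256;
    }

    /** Places keys "user:0" .. "user:(keyCount-1)" and returns the number of keys per node. */
    static int[] distribute(boolean hashed, int keyCount) {
        int[] perNode = new int[NODES];
        for (int i = 0; i < keyCount; i++) {
            String key = "user:" + i;
            int token = hashed ? hashedToken(key) : orderedToken(key);
            perNode[ownerOf(token)]++;
        }
        return perNode;
    }

    public static void main(String[] args) {
        System.out.println("ordered: " + Arrays.toString(distribute(false, 1000)));
        System.out.println("hashed:  " + Arrays.toString(distribute(true, 1000)));
    }
}
```

With the ordered token, every key starts with the same byte, so all 1000 keys land on a single node; with the hashed token they spread over all four. A token-aware client then simply follows the data, so an uneven placement shows up as uneven connections.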