Cassandra not balancing data over existing nodes in cluster


Greetings, I have configured a 3-node Cassandra 1.2.12 cluster and I am able to connect to the master node and create keyspaces and tables across all nodes. However, when I run YCSB to load data into the cluster, all of the data ends up on the master. Since I am loading 1000000 records, I calculated the initial tokens by dividing that number by the number of nodes I have. When I run nodetool I get something like:

Address     Rack  Status  State   Load     Owns  Token
10.3.2.8    2     Up      Normal  1.08GB   100%  0
10.3.1.231  2     Up      Normal  67.58KB  0%    330000
10.3.1.128  2     Up      Normal  52.79KB  0%    660000

Has anyone had the same problem? I have tried using tokengentool to assign tokens and different partitioners (Murmur3 and Random), but the result was always the same: all data was loaded onto the master node.

Regards, Veronika.


1 Answer

Aaron (Best Answer)

A "row" does not equal a token in Cassandra. Regardless of the number of rows you intend to store, Cassandra's RandomPartitioner supports 2^127 tokens. For a 3-node cluster, those initial tokens should be increments of 56,713,727,820,156,410,577,229,101,238,628,035,242 apart from each other.

Using DataStax's Python script for computing initial tokens, these RandomPartitioner values should work for you:

node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485

If you are using the Murmur3Partitioner (token range -2^63 to 2^63 - 1), use these values instead:

node 0: -9223372036854775808
node 1: -3074457345618258603
node 2: 3074457345618258602
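
Both lists come from the same simple arithmetic: divide the partitioner's token range evenly by the number of nodes. Here is a minimal sketch of that calculation (my own, not the DataStax script; the file name tokens.py is just a placeholder) that you can use to recompute tokens for a different node count:

#!/usr/bin/env python
# Evenly spaced initial tokens for a cluster of N nodes.
# Usage: python tokens.py <number_of_nodes>
import sys

def random_partitioner_tokens(node_count):
    # RandomPartitioner's token range is 0 .. 2**127 - 1
    return [i * (2 ** 127) // node_count for i in range(node_count)]

def murmur3_partitioner_tokens(node_count):
    # Murmur3Partitioner's token range is -2**63 .. 2**63 - 1
    return [-(2 ** 63) + i * (2 ** 64) // node_count for i in range(node_count)]

if __name__ == "__main__":
    nodes = int(sys.argv[1]) if len(sys.argv) > 1 else 3
    print("RandomPartitioner:")
    for i, token in enumerate(random_partitioner_tokens(nodes)):
        print("  node %d: %d" % (i, token))
    print("Murmur3Partitioner:")
    for i, token in enumerate(murmur3_partitioner_tokens(nodes)):
        print("  node %d: %d" % (i, token))

Running it with 3 nodes prints exactly the values listed above.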

So at this point, you have two choices:

1 - Decommission 10.3.1.231 and 10.3.1.128, stop those nodes, alter their initial_token values to match what I have above, and restart them. But given that you have mentioned trying both the Murmur3 and the RandomPartitioner, I think it might be best for you to go with option #2 below.

2 - Stop all nodes, delete your data, follow these instructions, and reload your data.

Also, you may want to adjust the replication factor defined for your keyspace(s). For a 3-node cluster, you will want a replication factor of at least 2. That way, if one server goes down, you will still have another copy of the data available, and your application should continue to work (as long as your read/write consistency is set to ONE).
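
If you want to script that change, here is a hedged sketch using the DataStax Python driver; the keyspace name "usertable" is just my guess at what YCSB created for you (substitute your own), and on 1.2 this assumes the native transport is enabled. Running the same ALTER KEYSPACE statement from cqlsh works just as well.

# Sketch: raise the replication factor to 2 on a YCSB keyspace.
# Assumptions: DataStax Python driver installed, keyspace named "usertable".
from cassandra.cluster import Cluster

cluster = Cluster(["10.3.2.8"])   # any live node works as a contact point
session = cluster.connect()

session.execute(
    "ALTER KEYSPACE usertable "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}"
)

cluster.shutdown()

Existing data is not re-replicated automatically, so run nodetool repair on each node afterwards to build the second copies.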