Hi, I have a Cassandra DB with a huge amount of data, and I am using only one node to store it. Someone suggested that I use multiple nodes.
So what will happen if I add a new node? Will the data get replicated to the other node, or will it be distributed equally with the other node?
I am new to Cassandra and DB management. It would be very helpful if someone could share some thoughts on this.
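Both. Data gets replicated to other nodes according to the replication strategy and replication factor of each keyspace, and it is also split across nodes to balance the load. With a replication factor of 1, each row lives on exactly one node, so adding a node only redistributes the data; with a higher replication factor, each row is additionally copied to that many nodes. A new node that joins the cluster automatically assumes responsibility for an even portion of the data.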
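To make the difference between distribution and replication concrete, here is a rough toy model in Python. It is not how Cassandra is implemented: the `ToyRing` class, the `token` helper, the node names, and the use of MD5 as the hash are all assumptions made for illustration. It only mimics the general idea of a token ring where each key is stored on the next `replication_factor` distinct nodes.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is only a stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Hypothetical, simplified token ring; not Cassandra's real code."""

    def __init__(self, nodes, replication_factor=1, vnodes=64):
        self.replication_factor = replication_factor
        self.vnodes = vnodes
        self.ring = []  # sorted list of (token, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node claims several positions on the ring so that the
        # key space is shared out in many small slices.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (token(f"{node}#{i}"), node))

    def replicas(self, key):
        # Walk clockwise from the key's token and collect distinct nodes
        # until we have replication_factor of them (or run out of nodes).
        distinct_nodes = {name for _, name in self.ring}
        wanted = min(self.replication_factor, len(distinct_nodes))
        i = bisect.bisect_left(self.ring, (token(key), ""))
        owners = []
        while len(owners) < wanted:
            name = self.ring[i % len(self.ring)][1]
            if name not in owners:
                owners.append(name)
            i += 1
        return owners
```

For example, `ToyRing(["node1"]).replicas("row-1")` returns `["node1"]`. After `add_node("node2")` on a ring with `replication_factor=1`, some keys map to `node2` instead (distribution), while with `replication_factor=2` every key maps to both nodes (replication).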
P.S. I'd advise you to run [nodetool cleanup](http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html) on the old nodes after adding the new nodes. This removes keys that no longer belong to the old nodes.
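As a rough illustration of why cleanup matters, the toy Python sketch below works out which keys an old node still holds on disk but no longer owns once a new node has joined. Everything in it is an assumption for illustration (the `token`, `build_ring`, and `owner` helpers, the node names, MD5 as the hash, and a replication factor of 1); it does not call or reproduce `nodetool`.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 is only a stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes, vnodes=64):
    """Return a sorted list of (token, node) pairs, several per node."""
    return sorted((token(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))


def owner(ring, key):
    """Return the node owning the first ring position at or after the key."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token(key)) % len(ring)
    return ring[idx][1]


keys = [f"row-{i}" for i in range(1000)]
before = build_ring(["old-node"])
after = build_ring(["old-node", "new-node"])

# Everything was written to the old node while it was alone.
on_disk_old = {k for k in keys if owner(before, k) == "old-node"}
# After the new node joins, the old node owns only part of the key space.
still_owned = {k for k in keys if owner(after, k) == "old-node"}
# These keys now belong to the new node but still take up space on the
# old one; in this toy model they are what a cleanup would remove.
stale = on_disk_old - still_owned
print(f"{len(stale)} of {len(keys)} keys are stale on old-node")
```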