What is the Correct order to restart a cluster for point-in-time restore?

141 views Asked by At

I have a mixed workload cluster across multiple datacenters. I have ran the sstableloader command for the tables I want to restore using snapshots which I had backed up. I have added commit log files which I had backed up from archive to a restore directory on all nodes. I have updated the commitlog_archiving.properties file with these configs. What is the correct way and order to restart nodes of my cluster? Do these considerations apply for restarting as well?

2

There are 2 answers

0
Erick Ramirez On BEST ANSWER

As a general rule, we recommend restarting seed nodes in the DC first before other nodes so gossip propagation happens faster particularly for larger clusters (arbitrarily 15+ nodes). It is important to note that a restart is not required if you restored data using sstableloader.

If you are just performing a rolling restart then the order of the DCs does not matter. But it matters if you are starting up a cluster from a cold shutdown meaning all nodes are down and the cluster is completely offline.

When starting from a cold shutdown, it is important to start with the "Analytics DC" (nodes running in Analytics mode, i.e with Spark enabled) because it makes it easier to elect a Spark master. Assuming that the replication for Analytics keyspaces are configured with the recommended replication factor of 3, you will need to start 2 or 3 nodes beginning with the seeds ideally 1 minute apart because the LeaderManager requires a quorum of nodes to elect a Spark master.

We recommend leaving DCs with nodes running in Search mode (with Solr enabled) last as a matter of convenience so that all the other DCs are operational before the cluster starts accepting Search requests from the application(s). Cheers!

0
Aaron On

If you've done all of that, I don't think the order matters too much. Although, you should restart your seed nodes first, that way the nodes in the cluster have a common cluster entrypoint to find their way back in and correctly rejoin.