This might seem silly, but I deleted everything in the data directory (/var/lib/scylla/data/*) on one of the nodes. Now, to bring the data back, I can run nodetool repair or nodetool rebuild.
I am confused about rebuild. I read that it is primarily used when creating a new datacenter. So, in my case, if I run nodetool rebuild on my node in the existing cluster, what would happen? Will it simply copy data from the replicas for all the token ranges that the node was responsible for? Or will it somehow disturb the token ranges and cause data loss?
I understand that nodetool repair will create a consistent copy of the data. But assume that I have no strict requirement for consistency and just want to bring the data back on my node as quickly as possible. Which option is safer, repair or rebuild?
It depends on the state of the node.
If you deleted everything including the system tables, the node will try to auto-bootstrap but fail, because it can't re-join the cluster without the replace_address flag. For this reason, you won't be able to run a repair, since the node won't be up.
You can only run rebuild if the node is added back into the cluster with auto_bootstrap: false. That's problematic because in this state the node will accept read requests even though it doesn't have the data. This means that reads done with a consistency level of ONE or LOCAL_ONE may return no data when they are served by this node. Cheers!