Riak unrecoverable disk failure


I have a 3-node Riak cluster, each node using roughly 1 TB of disk. One node's hard disk suddenly failed unrecoverably, so I added a new node using the following steps:

1) riak-admin cluster join
2) down the failed node
3) riak-admin force-replace failed-node new-node
4) riak-admin cluster plan
5) riak-admin cluster commit
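
For concreteness, the full sequence looks roughly like this (the node names are hypothetical, and in recent Riak releases force-replace is a subcommand of riak-admin cluster):

# run on the new node, joining it to a surviving cluster member
riak-admin cluster join riak@node1.example.com
# mark the failed node as down (run from a healthy node)
riak-admin down riak@failed.example.com
# stage a plan handing the failed node's partitions to the new node
riak-admin cluster force-replace riak@failed.example.com riak@new.example.com
# review the staged transition, then apply it
riak-admin cluster plan
riak-admin cluster commit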

This almost fixed the problem, except that after lots of data transfers and handoffs, not all three nodes have 1 TB of disk usage anymore. Two of them do; the third is almost empty. That means there are no longer three copies of the data on disk. What commands should I run to force three replicas back onto disk, without waiting for read repair or anti-entropy to recreate them?


1 Answer

Answered by RiakUser:

Answer obtained by posting the same question to [email protected]:

(0) Three nodes are insufficient; you should have 5 nodes.

(1) You could iterate over and read every object in the cluster; this would also trigger read repair for every object.

(2) Copied from Engel Sanchez's response to a similar question, April 10th 2014:

* If AAE is disabled, you don't have to stop the node to delete the data in the anti_entropy directories.

* If AAE is enabled, deleting the AAE data in a rolling manner may trigger an avalanche of read repairs between nodes with the bad trees and nodes with good trees as the data seems to diverge.

If your nodes are already up, with AAE enabled and with old incorrect trees in the mix, there is a better way. You can dynamically disable AAE with some console commands. At that point, without stopping the nodes, you can delete all AAE data across the cluster. At a convenient time, re-enable AAE. I say convenient because all trees will start to rebuild, and that can be problematic in an overloaded cluster. Doing this over the weekend might be a good idea unless your cluster can take the extra load.
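
Concretely, once AAE is disabled (see the console commands below), wiping the AAE data means removing the contents of each node's anti_entropy directory. A minimal sketch, assuming the default data directory of a packaged install; check platform_data_dir in your configuration for the actual location:

# run on every node in the cluster, only after AAE has been disabled
rm -rf /var/lib/riak/anti_entropy/*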

To dynamically disable AAE from the Riak console, you can run this command:

riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, disable, [], 60000).

and re-enable it with the similar command:

riak_core_util:rpc_every_member_ann(riak_kv_entropy_manager, enable, [], 60000).

That last number is just a timeout for the RPC operation. I hope this saves you some extra load on your clusters.

(3) Reading every object in the cluster boils down to:

(3a) List all keys using the client of your choice
(3b) Fetch each object
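
As an illustrative sketch against Riak's HTTP API (localhost, port 8098, and the bucket name mybucket are placeholders, and jq is assumed for JSON parsing; a full key listing is expensive, so avoid it on a heavily loaded production cluster):

# list every key in the bucket (keys=stream is friendlier for very large buckets)
curl -s 'http://localhost:8098/buckets/mybucket/keys?keys=true'

# fetch each object; the read itself triggers read repair on divergent replicas
# (keys containing special characters would need URL-encoding)
for key in $(curl -s 'http://localhost:8098/buckets/mybucket/keys?keys=true' | jq -r '.keys[]'); do
  curl -s -o /dev/null "http://localhost:8098/buckets/mybucket/keys/$key"
done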

https://www.tiot.jp/riak-docs/riak/kv/2.2.3/developing/usage/reading-objects/

https://www.tiot.jp/riak-docs/riak/kv/2.2.3/developing/usage/secondary-indexes/
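
The secondary-indexes link is presumably there because, on the eleveldb backend, the built-in $key index lets you enumerate keys in ranges instead of listing the whole keyspace at once. A hypothetical range query over keys:

# requires the eleveldb backend; $key indexes every object by its own key
curl 'http://localhost:8098/buckets/mybucket/index/$key/0/zzzzzzzz'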