Exactly-once guarantee in Storm Trident in network partitioning and/or failure scenarios

77 views Asked by At

So, Apache Storm + Trident provide the exactly-once semantics. Imagine I have the following topology:

TridentSpout -> SumMoneyBolt -> SaveMoneyBolt -> Persistent Storage.

CalculateMoneyBolt sums monetary values in memory, then passes the result to SaveMoneyBolt which should save the final value to a remote storage/database.

Now it is very important that we calculate these values and store only once to the database. We do not want to accidentally double count the money.

So how does Storm with Trident handle network partitioning and/or failure scenarios when the write request to the database has been successfully sent, the database has successfully received the request, logged the transaction, and while responding to the client, the SaveMoneyBolt has either died or partitioned from the network before having received the database response?

I assume that if SaveMoneyBolt had died, Trident would retry the batch, but we cannot afford double counting.

How are such scenarios handled?

Thanks.

1

There are 1 answers

0
arunmahadevan On BEST ANSWER

Trident gives a unique transaction id for each batch. If a batch is retried it will have the same txid. Also the batch updates are ordered, i.e. the state update for a batch will not happen until the update for the previous batch is complete. So by storing the txid along with the values in the state trident can de-duplicate the updates and provide exactly once semantics.

Trident comes with a few built-in Map state implementations which handles all this automatically.

For more information take a look at the docs :