Should I use Trident to compute the global mean of tuples in Storm?


I want to use Storm to compute the mean of incoming tuples of the form [int id, int value]. As you can see, I can't partition the data with a fields grouping. I need a topology architecture that distributes this computation, and the only approach I can think of is to process mini-batches within each bolt instance and then aggregate the results.

As I understand it, Trident is the appropriate way to do mini-batch processing within Storm.

What is the best practice for computing global statistics with Storm, such as means, global counts, and standard deviations, when you can't partition the data on an attribute? Is there an example topology?


There is 1 answer

Pierre Merienne

You can easily compute stream statistics such as mean, standard deviation, and count using Trident-ML. A section in its README explains how to compute these stats within a Trident topology.

Hope it helps.
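
To illustrate why the lack of a partitioning key is not a problem here, below is a minimal sketch of the underlying idea in plain Java. It deliberately has no Storm or Trident-ML dependency, and every name in it (GlobalStatsSketch, Moments, aggregate) is hypothetical rather than taken from either library. Each partition keeps a small mergeable summary (count, sum, sum of squares), and the summaries are combined once at the end. The zero/of/combine methods follow the same general shape as Trident's CombinerAggregator (zero, init, combine), so I would expect the idea to carry over to a real topology, but treat that mapping as an assumption to check against the Trident docs.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: plain Java, no Storm dependency. All names are hypothetical.
final class GlobalStatsSketch {

    // Mergeable partial aggregate: count, sum and sum of squares of the values.
    static final class Moments {
        final long count;
        final double sum;
        final double sumSq;

        Moments(long count, double sum, double sumSq) {
            this.count = count;
            this.sum = sum;
            this.sumSq = sumSq;
        }

        // Identity element: combining with zero() changes nothing.
        static Moments zero() {
            return new Moments(0L, 0.0, 0.0);
        }

        // Summary of a single tuple's value field.
        static Moments of(int value) {
            return new Moments(1L, value, (double) value * value);
        }

        // Associative and commutative merge, so partitions can be combined in any order.
        Moments combine(Moments other) {
            return new Moments(count + other.count, sum + other.sum, sumSq + other.sumSq);
        }

        double mean() {
            return count == 0 ? Double.NaN : sum / count;
        }

        // Population variance, clamped at 0 to absorb floating-point rounding.
        double variance() {
            if (count == 0) {
                return Double.NaN;
            }
            double m = mean();
            return Math.max(0.0, sumSq / count - m * m);
        }

        double stdDev() {
            return Math.sqrt(variance());
        }
    }

    // Partial aggregate for the values seen by one partition (or one mini-batch).
    static Moments aggregate(List<Integer> values) {
        Moments acc = Moments.zero();
        for (int v : values) {
            acc = acc.combine(Moments.of(v));
        }
        return acc;
    }

    public static void main(String[] args) {
        // Two arbitrary "partitions", as a shuffle grouping might produce.
        Moments p1 = aggregate(Arrays.asList(2, 4, 4, 4));
        Moments p2 = aggregate(Arrays.asList(5, 5, 7, 9));

        // The global step only has to merge one small summary per partition.
        Moments global = p1.combine(p2);
        System.out.println("count=" + global.count
                + " mean=" + global.mean()
                + " stdDev=" + global.stdDev());
    }
}
```

Because combine is associative, it does not matter how the tuples are spread across partitions, which is why a shuffle grouping is enough. Note that the sum-of-squares formula can lose precision when values are large relative to their spread; a merge based on running mean and squared deviations is a more robust alternative if that matters for your data.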