I am currently working on a small Spark Streaming job that computes a stock correlation matrix from a DStream.
From a DStream[(Long, Double)] of (time, quote) pairs, I need to aggregate the quotes (Double) by time (Long) across multiple RDDs before computing the correlations over all of the quotes in those RDDs:
    dstream.reduceByKeyAndWindow(/* aggregate the quotes of each time key into a Vector */)
           .foreachRDD { rdd => Statistics.corr(rdd.values) }
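In full, this is roughly what I have in mind (a minimal, untested sketch; the socket source, the 60s/10s window sizes, and the Seq-based aggregation are placeholders I chose for illustration):

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.dstream.DStream

    val conf = new SparkConf().setAppName("StockCorr").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(1))

    // Placeholder source: assume "time,quote" lines arrive on a socket.
    val quotes: DStream[(Long, Double)] = ssc.socketTextStream("localhost", 9999)
      .map(_.split(",")).map(a => (a(0).toLong, a(1).toDouble))

    // Gather every quote seen for the same timestamp within a 60s window,
    // recomputed every 10s (window sizes are arbitrary placeholders).
    val byTime = quotes
      .mapValues(Seq(_))
      .reduceByKeyAndWindow((a: Seq[Double], b: Seq[Double]) => a ++ b,
        Seconds(60), Seconds(10))

    byTime.foreachRDD { rdd =>
      // One row Vector per timestamp; note that Statistics.corr requires
      // every row to have the same length (same number of quotes per time key).
      val rows = rdd.map { case (_, qs) => Vectors.dense(qs.toArray) }
      if (!rows.isEmpty()) {
        println(Statistics.corr(rows)) // Pearson correlation matrix by default
      }
    }

    ssc.start()
    ssc.awaitTermination()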
To my mind, this would work if the resulting DStream (produced by reduceByKeyAndWindow) contained only one RDD holding all of the aggregated quotes.
But I am not sure that it does. How is the data distributed after reduceByKeyAndWindow? Is there a way to merge the RDDs of a DStream into a single RDD?
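To make the second question concrete, this is the kind of check I have in mind (a hypothetical diagnostic on the byTime stream from the sketch above):

    // Diagnostic: how many partitions does each windowed batch RDD have,
    // and how many distinct time keys does it hold?
    byTime.foreachRDD { rdd =>
      println(s"partitions = ${rdd.partitions.length}, keys = ${rdd.count()}")
    }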