Spark's takeSample() results in two stages

Question

Asked by Oleg Shirokikh on 11 June 2015 at 16:41

I've observed interesting behavior in Spark 1.3.1, the reason for which is not clear.

Doing something as simple as sc.textFile("...").takeSample(...) always results in two stages:

[Screenshot of the two stages]

There is 1 answer

**Justin Pihony** · Accepted Answer · 2015-06-11T17:48:37+00:00

I was able to reproduce this. The key is to look at the details expansion of each stage: the two stages point to different line numbers inside takeSample. The first is line 428, a call to count, which is an action in its own right and therefore triggers a stage of its own. The second is line 447, the call to sample itself. This can be confusing and could possibly be fixed, but I wouldn't expect it to be a high priority.
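To make the count-then-sample pattern concrete, here is a minimal pure-Python sketch. It is not Spark's implementation: CountingDataset, take_sample, and the 1.5 over-sampling factor are all hypothetical stand-ins chosen for illustration. It only shows why a sampling routine that needs the dataset size first ends up scanning the data at least twice.

```python
import random


class CountingDataset:
    """Hypothetical stand-in for an RDD that records each full pass over the data."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.passes = []  # names of the operations that scanned the data, in order

    def count(self):
        self.passes.append("count")
        return sum(1 for _ in self._rows)

    def sample(self, fraction, seed):
        self.passes.append("sample")
        rng = random.Random(seed)
        # Keep each row independently with probability `fraction`.
        return [row for row in self._rows if rng.random() < fraction]


def take_sample(dataset, num, seed=0):
    """Simplified model of sampling without replacement: count first, then sample."""
    total = dataset.count()  # first pass: the size is needed to pick a fraction
    if num <= 0 or total == 0:
        return []
    # Over-sample a little so one pass usually yields enough rows.
    # The 1.5 factor is an arbitrary assumption, not Spark's formula.
    fraction = min(1.0, 1.5 * num / total)
    rng = random.Random(seed)
    picked = dataset.sample(fraction, rng.randrange(2**32))  # second pass
    # Retry with a fresh seed if the sample came up short.
    while len(picked) < num and len(picked) < total:
        picked = dataset.sample(fraction, rng.randrange(2**32))
    rng.shuffle(picked)
    return picked[:num]


if __name__ == "__main__":
    ds = CountingDataset(range(1000))
    print(take_sample(ds, 10, seed=42))
    print(ds.passes)
```

After a call, ds.passes always starts with "count" followed by "sample", which mirrors the two separate entries seen for a single takeSample call. Extra "sample" entries appear only when a pass returns fewer rows than requested.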
