Debug failed shuffles in Hadoop MapReduce


I am seeing that as the size of the input file increases, the number of failed shuffles increases and the job completion time grows non-linearly.

e.g.

75GB took 1h
86GB took 5h

I also see the average shuffle time increase roughly tenfold.

e.g.

75GB 4min
85GB 41min

Can someone point me in a direction to debug this?
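For reference, the failed-shuffle and shuffle-volume numbers above can be read straight from the job counters once a run finishes. A minimal sketch, assuming the job is submitted from a Java driver that keeps the `Job` handle (the class and method names other than the Hadoop API are placeholders):

```java
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class ShuffleCounterReport {
    // Prints the shuffle-related counters of a completed MapReduce job.
    // "job" is assumed to be the Job handle the driver used to submit the job.
    static void report(Job job) throws Exception {
        Counters counters = job.getCounters();
        long failedShuffles = counters.findCounter(TaskCounter.FAILED_SHUFFLE).getValue();
        long shuffledMaps   = counters.findCounter(TaskCounter.SHUFFLED_MAPS).getValue();
        long shuffleBytes   = counters.findCounter(TaskCounter.REDUCE_SHUFFLE_BYTES).getValue();

        System.out.println("Failed shuffles:      " + failedShuffles);
        System.out.println("Shuffled map outputs: " + shuffledMaps);
        System.out.println("Reduce shuffle bytes: " + shuffleBytes);
    }
}
```

Comparing these counters between the 75 GB and 86 GB runs should show whether the extra hours come from retried/failed shuffle fetches or simply from more data being shuffled.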


1 Answer

FabioFLX:

If you are sure your algorithms are correct, disk volume partitioning or fragmentation problems may be occurring somewhere past that 75 GB threshold, since you are probably using the same filesystem to cache the intermediate results.
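One way to check that hypothesis is to look at which local directories the intermediate map outputs are spilled to and how much space is left on them. A minimal sketch, assuming the Hadoop `*-site.xml` files are on the classpath; which property actually applies depends on the Hadoop version and whether the job runs on YARN:

```java
import java.io.File;
import org.apache.hadoop.conf.Configuration;

public class LocalDirCheck {
    public static void main(String[] args) {
        // Loads core-site.xml / mapred-site.xml / yarn-site.xml from the classpath.
        Configuration conf = new Configuration();

        // Directories used for shuffle spill / intermediate data:
        // "mapreduce.cluster.local.dir" for classic MapReduce,
        // "yarn.nodemanager.local-dirs" when running on YARN.
        for (String key : new String[] {"mapreduce.cluster.local.dir", "yarn.nodemanager.local-dirs"}) {
            String[] dirs = conf.getStrings(key);
            if (dirs == null) {
                continue;
            }
            for (String dir : dirs) {
                File f = new File(dir);
                System.out.printf("%s -> %s, usable space %d GiB%n",
                        key, dir, f.getUsableSpace() / (1024L * 1024 * 1024));
            }
        }
    }
}
```

If those directories sit on the same volume as the HDFS data, or on a disk that fills up once the input grows past ~75 GB, the shuffle spills will compete for space and I/O, which would be consistent with the non-linear slowdown you are seeing.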