Spark on cluster: I would like to know the meaning of the following error and possible causes:

I get the following warning and error:

1) WARN AkkaRpcEndpointRef: Error sending message [message = Heartbeat(2,[Lscala.Tuple2;@58149ee3,BlockManagerId(2, 192.168.0.171, 49714))] in 1 attempts java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]

2) ERROR CoarseGrainedExecutorBackend: Driver 192.168.0.131:41837 disassociated! Shutting down.

I'm running a Spark (v. 1.4.0) app on a cluster of 4 machines in which the driver has less memory (4 GB) than the workers (8 GB each). Is it possible that the driver produces the error because of its workload?

There is 1 answer

Beniamino Del Pizzo On BEST ANSWER

The driver was under heavy load during the computation and could not respond to the executors' heartbeats in time, hence the timeout and the subsequent disassociation. The problem was solved simply by adding more RAM to the driver.
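
For anyone hitting the same thing: once the driver machine has more physical RAM, the driver JVM also has to be allowed to use it. A minimal sketch of the relevant entry in `conf/spark-defaults.conf` follows; the value `6g` is only an illustration (not the value I actually used) and must fit within the memory physically available on the driver host.

```properties
# conf/spark-defaults.conf
# Heap size for the driver process. "6g" is a hypothetical value chosen
# for illustration; pick one that fits the driver machine's physical RAM.
spark.driver.memory    6g
```

The same setting can be passed on the command line as `--driver-memory` to `spark-submit`. In client mode it should be set in one of these two ways rather than through `SparkConf` inside the application, because the driver JVM has already started by the time that code runs.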