SparkException: Task failed while writing rows, caused by Futures timed out


In my code, I simply do:

df.write.parquet(...)

It's a simple write of Parquet files to HDFS from a given DataFrame, at the end of the Spark app.
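For context, the write is essentially the following minimal sketch (the output path and the DataFrame below are placeholders, not the real ones from the app):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("my-app").getOrCreate()

    # Placeholder DataFrame; the real one comes out of the app's pipeline
    df = spark.range(1000)

    # Plain Parquet write to HDFS; the path is hypothetical
    df.write.parquet("hdfs:///tmp/example_output")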

When running the app, I get a "Task failed while writing rows" exception, caused by a futures timeout, as you can see below:

TimeoutException: Futures timed out after 10 seconds

(spark.task.maxFailures is set to 1, so it is expected that a single failed task triggers an ERROR and shuts the whole app down. I could go back to the default spark.task.maxFailures of 4, but that might end up hiding the root cause of the write problem.)
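For reference, that setting is pinned when the session is built; a minimal sketch of how it is declared (the app name is a placeholder, and the same setting can go on spark-submit via --conf):

    from pyspark.sql import SparkSession

    # Sketch: spark.task.maxFailures must be set before the context starts;
    # it is 1 here, while Spark's default is 4.
    spark = (SparkSession.builder
             .appName("my-app")
             .config("spark.task.maxFailures", "1")
             .getOrCreate())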

Questions:

  1. What is the problem when writing to HDFS?
  2. How to increase the 10-second futures timeout? (I see no config set to 10 seconds in the Spark UI Environment section, which is a bit strange; see the sketch after this list.)
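Regarding question 2, here is what I would try, purely as an assumption: Spark has several RPC/network timeouts that can be raised when the session is built. The three property names below are real Spark settings, but whether any of them governs this particular 10-second future is my guess, not something I have confirmed:

    from pyspark.sql import SparkSession

    # Hedged sketch: raising the usual network/RPC timeouts.
    # Which one (if any) backs the 10 s future in the error is an assumption.
    spark = (SparkSession.builder
             .appName("my-app")  # hypothetical name
             .config("spark.network.timeout", "300s")            # default 120s; umbrella network timeout
             .config("spark.rpc.askTimeout", "300s")             # falls back to spark.network.timeout when unset
             .config("spark.executor.heartbeatInterval", "30s")  # default 10s; keep well below the network timeout
             .getOrCreate())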

Note: it might not be related, but in other apps I also sometimes get that same strange 10-second futures timeout, for instance when writing to Cassandra (so not HDFS), or when an executor communicates with the Spark driver (although in that case the error is only a warning and the app continues). So I suspect global network issues?
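One observation while digging (a hint, not a confirmed diagnosis): spark.executor.heartbeatInterval defaults to exactly 10s, which matches the number in the message. The effective values can be read off the SparkConf to compare against the error:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Sketch: print the effective timeout-related settings, falling back to
    # Spark's documented defaults when a key is unset.
    conf = spark.sparkContext.getConf()
    for key, default in [("spark.network.timeout", "120s"),
                         ("spark.executor.heartbeatInterval", "10s")]:
        print(key, "=", conf.get(key, default))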
