We are doing hourly aggregations using Spark SQL and Cassandra on a huge dataset. We have developed a Java client that runs every hour to perform the aggregations using Spark SQL. For historic loads, when we run this program over 10 days (240 hours) of data, after around 100 hours have been processed Cassandra fails with the error below:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
What is causing Cassandra to fail?
This was resolved after configuring higher values for "spark.cassandra.read.timeout_ms" and "spark.cassandra.connection.timeout_ms".