Is it possible to run Spark (2.3) jobs on Hadoop 3 clusters, specifically HDP 3.1 and CDH 6 (beta)?


Also, CDH 6 is still in beta: does it support Spark 2.3 out of the box? Is it possible to run the same Spark 2.x versions (2.3 specifically) on Hadoop 3-enabled CDH or plain Hadoop clusters?

I'm interested in the backward-compatibility changes in the YARN, HDFS, and MapReduce APIs.
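For context, here's the kind of job I'd be running; a minimal sketch where the HDFS paths and app name are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Minimal smoke test: read a text file from HDFS, count words, write back.
// Exercises the HDFS client and YARN executors end to end when submitted
// with --master yarn. Paths below are placeholders.
object Hadoop3SmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hadoop3-smoke-test")
      .getOrCreate()

    val counts = spark.sparkContext
      .textFile("hdfs:///tmp/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs:///tmp/wordcount-out")
    spark.stop()
  }
}
```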

Is anyone using this in production?


1 Answer

Answered by mazaneicha:

CDH 6.0 GA was announced a couple of weeks ago. In addition to Hadoop 3, it packages Spark 2.2 as the default Spark version: https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_600_new_features.html#spark_new_features. However, it is possible to upgrade CDS (Cloudera's distribution of Apache Spark) to a higher (2.3.x) version separately.
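As a sanity check after the CDS upgrade, you can confirm from spark-shell which Spark and Hadoop versions the cluster is actually running; both calls are standard APIs, and the version strings in the comments are only illustrative:

```scala
import org.apache.hadoop.util.VersionInfo

// spark is predefined in spark-shell as the active SparkSession.
println(s"Spark:  ${spark.version}")          // e.g. 2.3.x after the CDS upgrade
println(s"Hadoop: ${VersionInfo.getVersion}") // e.g. 3.0.0-cdh6.0.0
```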
CDH 6 also seems to be unaffected by the Hive Metastore (HMS) incompatibility in Spark, according to https://www.cloudera.com/documentation/spark2/latest/topics/spark2_troubleshooting.html#spark_troubleshooting__hive_compatibility.
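If you want to verify the metastore compatibility yourself, a minimal check is to open a Hive-enabled session and run a metastore-backed query; this sketch assumes hive-site.xml is on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Build a session with Hive support so catalog calls go through the HMS client.
val spark = SparkSession.builder()
  .appName("hms-compat-check")
  .enableHiveSupport()
  .getOrCreate()

// Fails fast if the bundled Hive client cannot talk to the cluster's metastore.
spark.sql("SHOW DATABASES").show()
```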