Apache Spark 3.0 with HDP 2.6 stack


We are planning to set up Apache Spark 3.0 outside of our existing HDP 2.6 cluster and to submit jobs via YARN (v2.7) to that cluster, without upgrading or modifying it. Users currently run Spark 2.3, which is included in the HDP stack. The goal is to enable Apache Spark 3.0 outside of the HDP cluster without interrupting current jobs.

What is the best approach for this? Set up Apache Spark 3.0 client nodes outside of the HDP cluster and submit jobs from those new client nodes?

Any recommendations? What should we do to avoid conflicts with the current HDP stack and its components?
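For illustration, a minimal sketch of the setup being asked about, assuming the new install lives at /opt/spark-3.0.1 and the HDP cluster's client configs are copied to /etc/hadoop/conf (both paths are placeholders):

# Standalone Spark 3.0 on an edge node, pointed at the existing HDP cluster's
# client configs; nothing on the cluster itself is changed.
export SPARK_HOME=/opt/spark-3.0.1        # new install, separate from HDP's Spark 2.3
export HADOOP_CONF_DIR=/etc/hadoop/conf   # copies of the cluster's core-site.xml, yarn-site.xml, etc.

# Submit to the existing YARN 2.7 cluster from the new client node.
$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.1.jar 100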


1 Answer

Answered by mpkd567:

I built Spark 3.0.1 from the Spark 3.0.1 source code against the specific Hadoop and Hive versions shipped with HDP 2.6, then deployed it on the HDP client nodes only. The Spark 3.0.1 pre-built binaries had compatibility issues with Hive 1.2.1, since they are built against a newer Hive version.

Build options:

./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phive-1.2 -Phive-thriftserver -DskipTests -Dmaven.test.skip=true clean package
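As a usage sketch (the hdp.version string, install path, and HDFS archive location below are placeholders, not part of the answer above): when running a vanilla Spark build on HDP, hdp.version usually has to be passed as a Java option so the container launch scripts can resolve HDP classpath entries, and staging the Spark 3 jars in a dedicated HDFS archive keeps them separate from the stack's Spark 2.3 jars:

# Placeholder values: adjust hdp.version and the paths for your cluster.
export SPARK_HOME=/opt/spark-3.0.1-custom
export HADOOP_CONF_DIR=/etc/hadoop/conf

$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.driver.extraJavaOptions=-Dhdp.version=2.6.5.0-292 \
  --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.6.5.0-292 \
  --conf spark.yarn.archive=hdfs:///apps/spark3/spark-3.0.1-jars.zip \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.1.jar 100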