How to use external Spark with the Cloudera cluster?


I need to install Spark on a host that is not part of the Cloudera cluster and use it to submit Spark jobs to that cluster.

Is it possible to use Spark this way? If so, how should it be configured?
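To clarify the goal, this is roughly the kind of invocation I want to be able to run from the external host (assuming the cluster uses YARN as the resource manager; the class name and jar path are just placeholders):

    # Hypothetical job submitted from the external host to the cluster's YARN
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.MyJob \
      /path/to/my-job.jar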

What I've already tried:

1. Downloaded "https://www.apache.org/dyn/closer.lua/spark/spark-3.3.4/spark-3.3.4-bin-hadoop3.tgz" and extracted it on the external host

2. Copied the "conf" files from the Cloudera cluster to the new Spark directory

3. Exported the variables "HADOOP_CONF_DIR", "SPARK_CONF_DIR" and "SPARK_HOME", pointing them at the new "spark-3.3.4-bin-hadoop3" directory containing the copied files (the exact commands are sketched after the log output below)

4. When trying to run spark-shell as a test, it prints the banner and then hangs, as shown below:

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.4
      /_/

Using Scala version 2.13.8 (Java HotSpot(TM) 64-Bit Server VM, Java 11.0.16.1)
Type in expressions to have them evaluated.
Type :help for more information.
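For reference, this is roughly what steps 1-4 look like as commands on the external host. The gateway hostname and the /etc/hadoop/conf and /etc/hive/conf source paths are assumptions about a typical Cloudera layout, and the target directory is just an example:

    # Step 1: unpack the downloaded distribution
    tar -xzf spark-3.3.4-bin-hadoop3.tgz -C /opt

    # Step 2: copy the cluster client configuration from a cluster gateway node
    # (hostname and source paths are examples)
    mkdir -p /opt/spark-3.3.4-bin-hadoop3/conf/cluster-conf
    scp cluster-gateway:/etc/hadoop/conf/*.xml /opt/spark-3.3.4-bin-hadoop3/conf/cluster-conf/
    scp cluster-gateway:/etc/hive/conf/hive-site.xml /opt/spark-3.3.4-bin-hadoop3/conf/cluster-conf/

    # Step 3: point Spark and Hadoop at the new directory and the copied configs
    export SPARK_HOME=/opt/spark-3.3.4-bin-hadoop3
    export SPARK_CONF_DIR=$SPARK_HOME/conf
    export HADOOP_CONF_DIR=$SPARK_HOME/conf/cluster-conf
    export PATH=$SPARK_HOME/bin:$PATH

    # Step 4: start the shell against the cluster's YARN explicitly
    spark-shell --master yarn --deploy-mode client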

Note: the cluster uses Kerberos, so kinit was run before starting spark-shell.
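For completeness, this is roughly the Kerberos part (the principal and keytab are placeholders; the --principal/--keytab variant is only an option I am considering, not something I have confirmed fixes the hang):

    # Obtain a ticket before starting the shell (placeholder principal/keytab)
    kinit -kt /path/to/myuser.keytab myuser@EXAMPLE.REALM
    klist

    # Alternative: let Spark manage the ticket itself via its built-in options
    spark-shell --master yarn \
      --principal myuser@EXAMPLE.REALM \
      --keytab /path/to/myuser.keytab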
