Error: Could not find or load main class org.apache.spark.launcher.Main in Java Spark


I am trying to run a Spark application written in Java on GKP. I was able to build the image and place it in the container, but when I run the application with the spark-submit command I get the following error:

Error: Could not find or load main class org.apache.spark.launcher.Main

The Java and Spark versions I am using are JDK 11 and Spark 3.2.1. I am running this application via IntelliJ with Maven. I also tried adding the spark-launcher Maven dependency, but the issue persists.

Can anyone tell me where it is going wrong with these versions?

NOTE: I can see the spark-launcher jar in the spark-3.2.1 jars folder as well.


1 Answer

Hadi Rahjoo:

I had that error message. It may have several root causes, but this is how I investigated and solved the problem (on Linux):

  • Instead of launching spark-submit directly, run bash -x spark-submit to see which line fails.
  • Repeat that several times (since spark-submit calls nested scripts) until you find the underlying process being invoked; in my case it was something like:

/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell
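
For example, a minimal sketch of that tracing approach (the paths are illustrative; substitute your own Spark installation directory):

# Trace spark-submit itself; bash -x prints each shell line as it executes.
bash -x /opt/spark-2.2.0-bin-hadoop2.7/bin/spark-submit --version 2>&1 | tail -n 20

# spark-submit delegates to spark-class, which assembles the final java
# command line (including the -cp value), so trace that script too.
bash -x /opt/spark-2.2.0-bin-hadoop2.7/bin/spark-class org.apache.spark.deploy.SparkSubmit --version 2>&1 | tail -n 20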

So spark-submit launches a java process and can't find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I ran ls in this jars folder and counted 4 files instead of the whole Spark distribution (~200 files). It was probably a problem during the installation process, so I reinstalled Spark, checked the jars folder, and it worked like a charm.
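
To make those checks concrete, here is a rough sketch of the commands (using the same install path as above; substitute your own):

# Count the jars shipped with the distribution; a complete install has roughly 200.
ls /opt/spark-2.2.0-bin-hadoop2.7/jars/ | wc -l

# Verify that the launcher jar itself is present.
ls /opt/spark-2.2.0-bin-hadoop2.7/jars/ | grep launcher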

So, you should:

  • check the java command (the -cp option)

  • check your jars folder (does it at least contain all the spark-*.jar files?)
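
As a final smoke test, you can try loading the launcher class directly with the same -cp value as the traced java command above (a sketch; adjust the path to your installation). If the classpath is correct, the JVM should find the class, and the launcher should fail with a complaint about missing arguments rather than "Could not find or load main class":

# Loads org.apache.spark.launcher.Main using the same classpath spark-submit uses.
java -cp '/opt/spark-2.2.0-bin-hadoop2.7/jars/*' org.apache.spark.launcher.Main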

Hope it helps.