Setting default.parallelism in spark-submit command


What is the syntax to change the default parallelism when doing a spark-submit job?

I can specify the number of executors, executor cores and executor memory by the following command when submitting my spark job:

spark-submit --num-executors 9 --executor-cores 5 --executor-memory 48g

Setting the parallelism in code looks like this:

spark.conf.set("spark.default.parallelism",90)

If I wanted to set it in the spark-submit command instead, would it be:

spark-submit --default.parallelism 90

Answer by Michael Heil (best answer)

According to the Spark documentation on launching applications with spark-submit, the command has the following syntax:

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

spark-submit has no dedicated --default.parallelism flag; arbitrary Spark properties are passed as --conf <key>=<value>. In your case, add the following to change that setting:

spark-submit [...] --conf spark.default.parallelism=90
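
To show how this fits together with the resource flags from the question, here is a minimal Python sketch that assembles the full command line. The helper name build_spark_submit, the application file my_app.py, and the tasks_per_core factor of 2 are all assumptions made for illustration; the factor simply happens to reproduce the 90 in the question (9 executors x 5 cores x 2) and is not something the question states.

```python
import shlex


def build_spark_submit(num_executors, executor_cores, executor_memory,
                       tasks_per_core=2, app="my_app.py"):
    """Hypothetical helper: build a spark-submit argument list.

    tasks_per_core and app are illustrative assumptions, not values
    taken from the question.
    """
    # Derive the parallelism from the total number of executor cores.
    parallelism = num_executors * executor_cores * tasks_per_core
    return [
        "spark-submit",
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
        # Spark properties go through --conf as key=value.
        "--conf", f"spark.default.parallelism={parallelism}",
        app,
    ]


cmd = build_spark_submit(9, 5, "48g")
# Print the command as a single shell-safe string.
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd) directly, which avoids shell quoting issues; the printed string is only for inspection or pasting into a terminal.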