spark-submit process does not terminate automatically after job completion


I'm submitting a PySpark job using spark-submit in client mode on YARN.

spark-submit \
          --name $APP_NAME \
          --master yarn \
          --deploy-mode client \
          --num-executors 16 \
          --executor-cores 1 \
          --driver-memory 6g \
          --executor-memory 2g \
          --py-files myfile.py \
          --version 2.3

The job completes successfully and I can verify that in the Spark history server as well as in YARN. Even after job completion I still see the spark-submit process running, and it doesn't terminate.

I want to get the job status back in my calling program, which invokes spark-submit (Jenkins, using the Publish Over SSH plugin). Is there any way to make sure that the spark-submit process terminates with a proper exit code after the job finishes?

I've tried stopping the Spark context and returning an exit status at the end of the Python script, but this still doesn't work:

sc.stop()
sys.exit(0)

This happens intermittently, mostly for long-running jobs. I don't see the issue in cluster mode.


There are 2 answers

vaquar khan

You can wrap the command in a Unix shell script and then check the status of the spark-submit command via $?:

  spark-submit \
      --name $APP_NAME \
      --master yarn \
      --deploy-mode client \
      --num-executors 16 \
      --executor-cores 1 \
      --driver-memory 6g \
      --executor-memory 2g \
      --py-files myfile.py \
      --version 2.3

Then you can check the status and add your own conditions:

   if [ $? -eq 0 ]; then
       echo 'Success'
   else
       echo 'Fail'
   fi
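
As a fuller sketch of the same idea (the script name run_job.sh and the application file main.py are illustrative, not from the question), you can capture the exit code in a variable immediately after spark-submit returns and propagate it with exit, so the Jenkins Publish Over SSH step sees the same code:

    #!/bin/bash
    # run_job.sh -- illustrative wrapper invoked by Jenkins over SSH

    spark-submit \
        --name "$APP_NAME" \
        --master yarn \
        --deploy-mode client \
        --py-files myfile.py \
        main.py            # assumed application entry point, adjust to your job

    # Capture the exit code right away, before any other command overwrites $?
    status=$?
    echo "spark-submit finished with exit code ${status}"

    # Propagate the code so the calling Jenkins step can fail the build on non-zero
    exit ${status}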
wangkang

You can change --deploy-mode to cluster and try again.
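
For example, a sketch of the same submit in cluster mode (main.py is an assumed application entry point; spark.yarn.submit.waitAppCompletion defaults to true on YARN, so spark-submit should wait for the application to finish and exit with a code reflecting its final status):

    # Sketch only: same job resubmitted in cluster mode. With
    # spark.yarn.submit.waitAppCompletion=true (the default), the local
    # spark-submit process polls YARN until the application finishes and then
    # exits, so $? can be checked the same way as in the other answer.
    spark-submit \
        --name "$APP_NAME" \
        --master yarn \
        --deploy-mode cluster \
        --num-executors 16 \
        --executor-cores 1 \
        --driver-memory 6g \
        --executor-memory 2g \
        --conf spark.yarn.submit.waitAppCompletion=true \
        --py-files myfile.py \
        main.py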