How to invoke an oozie workflow via shell script and block/wait till workflow completion


I have created a workflow using Oozie that comprises multiple action nodes, and I have been able to run it successfully via a coordinator.

I want to invoke the Oozie workflow via a wrapper shell script.

The wrapper script should invoke the Oozie command, wait till the oozie job completes (success or error) and return back the Oozie success status code (0) or the error code of the failed oozie action node (if any node of the oozie workflow has failed).

From what I have seen so far, I know that as soon as I invoke the oozie command to run a workflow, the command exits with the job id getting printed on linux console, while the oozie job keeps running asynchronously in the backend.

I want my wrapper script to block till the oozie coordinator job completes and return back the success/error code.

Can you please let me know how/if I can achieve this using any of the oozie features?

I am using Oozie version 3.3.2 and bash shell in Linux.

Note: In case anyone is curious about why I need such a feature: the requirement is that my wrapper shell script should know how long an Oozie job has been running and when it has completed, and return a matching exit code so that the parent process calling the wrapper script knows whether the job completed successfully and, if it failed, can raise an alert/ticket for the support team.


There are 2 answers

2
codingmonkey On

To upload the workflow definition to HDFS, use the following command:

hdfs dfs -copyFromLocal -f workflow.xml /user/hdfs/workflows/workflow.xml

To start the Oozie job you need the two commands below. Note that each one must be written on a single line.

JOB_ID=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -submit)

oozie job -oozie http://<oozie-server>/oozie -start ${JOB_ID#*:} -config job.properties

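Then poll the job with the command below, sleeping for a fixed interval between attempts. Note that the exit code of this command is 0 whenever the info call itself succeeds; the state of the workflow (RUNNING, SUCCEEDED, KILLED, ...) has to be parsed from the Status line of its output.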

oozie job -oozie http://<oozie-server>/oozie -info ${JOB_ID#*:}

echo $?   # shows whether the oozie command itself executed successfully or not
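The "loop with sleep" idea can be sketched roughly as below. This is only an illustration under assumptions: `OOZIE_URL`, `get_status` and `wait_for_job` are hypothetical names, the default URL is a guess at a local server, and the parsing assumes the info output contains a line of the form `Status        : RUNNING`. It treats PREP, RUNNING and SUSPENDED as "not finished yet".

```shell
#!/usr/bin/env bash
# Sketch only: OOZIE_URL, get_status and wait_for_job are hypothetical names.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"  # assumed server URL

# Print the workflow status taken from the `oozie job -info` output.
# Assumes a line that looks like "Status        : RUNNING".
get_status() {
    oozie job -oozie "$OOZIE_URL" -info "$1" |
        sed -n 's/^Status *: *\([A-Z_]*\).*/\1/p' | head -n 1
}

# Poll until the job is no longer PREP/RUNNING/SUSPENDED, then print the
# final status. Returns 1 if no status could be read at all.
wait_for_job() {
    local job_id="$1" interval="${2:-60}" status
    while true; do
        status=$(get_status "$job_id")
        case "$status" in
            PREP|RUNNING|SUSPENDED) sleep "$interval" ;;
            "") echo "could not read status of $job_id" >&2; return 1 ;;
            *)  echo "$status"; return 0 ;;
        esac
    done
}
```

With the job id from the submit step, something like `final=$(wait_for_job "${JOB_ID#*: }" 30)` would block until the workflow reaches a final state, checking every 30 seconds.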

0
Garry On

You can do that by capturing the job id, then starting a loop that parses the output of oozie info. Below is the shell code for this.

Start oozie job

oozie_job_id=$(oozie job -oozie http://<oozie-server>/oozie -config job.properties -run );
echo $oozie_job_id;
sleep 30;

Parse the job id from the output. Here the output format is "job: jobid"

job_id=$(echo $oozie_job_id | sed -n 's/job: \(.*\)/\1/p');
echo $job_id;

Check the job status at a regular interval to see whether it is still RUNNING

while true
do
    # read the value of the "Status" line from the job info output
    job_status=$(oozie job -oozie http://<oozie-server>/oozie -info "$job_id" | sed -n 's/Status\(.*\): \(.*\)/\2/p');
    if [ "$job_status" != "RUNNING" ];
    then
        echo "Job is completed with status $job_status";
        break;
    fi
    # this sleep depends on your job, please change the value accordingly
    echo "sleeping for 5 minutes";
    sleep 5m
done

This is a basic way to do it; you can modify it to fit your use case.
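Since the question asks for an exit code the parent process can act on, the final `job_status` from the loop above still has to be turned into one. A minimal sketch, assuming made-up helper names (`exit_code_for_status`, `format_elapsed`) and an arbitrary numbering convention that is not defined by Oozie:

```shell
#!/usr/bin/env bash
# Sketch: exit_code_for_status is a hypothetical helper, and the numeric
# codes are an arbitrary convention chosen here, not something Oozie defines.
exit_code_for_status() {
    case "$1" in
        SUCCEEDED) return 0 ;;
        KILLED)    return 2 ;;
        FAILED)    return 3 ;;
        *)         return 1 ;;  # empty or unexpected status
    esac
}

# Format a number of seconds as HH:MM:SS, e.g. for logging the run time.
format_elapsed() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Possible use at the end of the polling script (job_status comes from the
# loop; SECONDS is the bash built-in counter of seconds since shell start):
#   echo "elapsed: $(format_elapsed "$SECONDS")" >&2
#   exit_code_for_status "$job_status"; exit $?
```

This only reports the overall workflow state. The error code of an individual failed action would have to be read separately from the action table in the info output.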