What is the best way to monitor and show the results of async jobs (like EMR and AWS Glue) that take 20-30 minutes to execute?


I have a job in my program that takes a long time to execute. I want to show the status of this job in my UI once it completes. I have found two solutions to this problem:

  1. Have an API call execute at the end of the 30-minute job to mark the job as complete. This is good because it can give additional information about what happened in the job, but it has a drawback: if something goes completely wrong, there is a chance that the code which calls the API never runs, and the status never updates.
  2. Monitor the task continuously once it has started: run a while loop and keep checking whether the task is done. This approach almost always yields the correct status, but it often exposes only a high-level yes/no rather than the fine-grained execution details that the job itself could report.
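The second option can be sketched roughly as follows. This is a minimal polling loop, not a definitive implementation: `wait_for_job`, `TERMINAL_STATES`, and the `MONITOR_TIMEOUT` marker are hypothetical names, and the set of terminal states is an assumption that needs adjusting to the service being polled.

```python
import time
from typing import Callable

# Assumption: states that mean "the job is finished"; adjust per service.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it returns a terminal state or the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            # Hypothetical marker: the monitor gave up, the job itself may still be running.
            return "MONITOR_TIMEOUT"
        sleep(poll_seconds)
```

For a Glue job, `get_status` could wrap boto3's `get_job_run` call and return the `JobRunState` field of the response. The deadline matters: without it, a job that hangs keeps the loop alive forever.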

One thing I haven't implemented, but which I think may be a good solution, is running both approaches in tandem: in the success case, I get the details of the execution from the API call; in case of total failure, I still get the outcome from the monitoring loop. What are the general principles followed when building monitoring support for jobs that take a long time to process?
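To make the tandem idea concrete, here is a small sketch of how the two signals could be merged into one status record. All names (`JobRecord`, `apply_callback`, `apply_poll`) are hypothetical, and the rule "the callback's detailed report wins over the poller's coarse status" is one possible design choice, not an established pattern from any AWS library.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: states that mean "the job is finished".
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


@dataclass
class JobRecord:
    job_id: str
    status: str = "RUNNING"
    details: dict = field(default_factory=dict)
    source: Optional[str] = None  # "callback" or "poll" once a final status is known


def apply_callback(record: JobRecord, status: str, details: dict) -> JobRecord:
    """The job reported its own result: keep the fine-grained details."""
    record.status = status
    record.details = details
    record.source = "callback"
    return record


def apply_poll(record: JobRecord, polled_status: str) -> JobRecord:
    """The monitor observed a status: use it only if no callback has arrived."""
    if record.source == "callback":
        return record  # never overwrite the richer report
    if polled_status in TERMINAL_STATES:
        record.status = polled_status
        record.source = "poll"
    return record
```

With this rule the UI shows execution details whenever the job manages to report them, and still reaches a final status through the poller when the job dies before its last API call.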


There is 1 answer

Ngenator On BEST ANSWER

Use AWS Step Functions as a serverless state machine. It can interact directly with a number of AWS services; the supported integrations are listed at https://docs.aws.amazon.com/en_us/step-functions/latest/dg/connect-supported-services.html
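As a rough illustration of what such a state machine could look like (a sketch under stated assumptions, not the answerer's own setup), the snippet below builds an Amazon States Language definition in Python. It assumes a Glue job run through the `glue:startJobRun.sync` integration, which waits for the job to finish, plus a hypothetical Lambda function that writes the status shown in the UI. The helper name `build_definition`, the state names, and the payload shape are all made up for this example.

```python
import json


def build_definition(job_name: str, status_function: str) -> str:
    """Return an Amazon States Language definition as a JSON string."""

    def report(status: str) -> dict:
        # Invoke the (hypothetical) status-updating Lambda with the outcome.
        return {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": status_function,
                "Payload": {"status": status, "context.$": "$"},
            },
            "End": True,
        }

    definition = {
        "Comment": "Run a long job, then report success or failure",
        "StartAt": "RunJob",
        "States": {
            "RunJob": {
                "Type": "Task",
                # The .sync suffix makes Step Functions wait for the job run to end.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": job_name},
                # Any error, including a failed job run, goes to the failure branch.
                "Catch": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "ResultPath": "$.error",
                        "Next": "ReportFailure",
                    }
                ],
                "Next": "ReportSuccess",
            },
            "ReportSuccess": report("SUCCEEDED"),
            "ReportFailure": report("FAILED"),
        },
    }
    return json.dumps(definition, indent=2)
```

The resulting string could be passed as the `definition` argument of boto3's `create_state_machine`, together with a name and an IAM role ARN of your own. Because the failure branch lives in the state machine rather than in the job, the status still gets updated when the job itself crashes, which addresses the drawback of the first option in the question.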