I have a job that can take several hours to run. Occasionally it fails for some reason (out of memory, a cluster rebalance, and so on). The job usually runs overnight, so someone has to check on it in the morning and restart it manually, which is enough most of the time. I was wondering whether Spring Cloud Data Flow (SCDF) can solve this problem.
Ideally, I would want SCDF to send an email (or call a webhook) when the job finishes, whether it failed or succeeded, and to retry the entire job if it fails. Is that possible?
SCDF is a lightweight Spring Boot application that exposes a set of RESTful APIs, so you can use those APIs to build the automation you want.
There is currently no built-in email functionality that automates this workflow out of the box.
You could, however, write a small application that periodically polls SCDF's RESTful APIs and, depending on the state of the job execution, sends the email and/or relaunches the job.
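As a rough illustration of such a helper, here is a minimal Python sketch of the decision logic plus the two HTTP calls. Everything in it is an assumption to adapt: the server address, the webhook URL, the retry limit, and the notification payload are placeholders, and the restart call assumes your SCDF version exposes `PUT /jobs/executions/{id}?restart=true`, so check the REST API guide for your release. Reading the status out of the job execution response is left out on purpose, because the JSON layout should be verified against your server.

```python
import json
import urllib.request

# Placeholder values, not taken from the question: adjust to your setup.
SCDF_URL = "http://localhost:9393"
WEBHOOK_URL = "http://localhost:8080/hooks/job-status"  # hypothetical webhook
MAX_RESTARTS = 2

# Spring Batch statuses that mean the job has not finished yet.
IN_PROGRESS = {"STARTING", "STARTED", "STOPPING"}


def decide_action(status, restarts_so_far, max_restarts=MAX_RESTARTS):
    """Map a batch status string to 'wait', 'restart', or 'notify'."""
    if status in IN_PROGRESS:
        return "wait"
    if status == "FAILED" and restarts_so_far < max_restarts:
        return "restart"
    # COMPLETED, STOPPED, ABANDONED, or FAILED with no retries left.
    return "notify"


def call(method, url, payload=None):
    """Send one HTTP request and return the response body as text."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def handle_execution(execution_id, status, restarts_so_far):
    """Act on one job execution: restart it if allowed, and report the outcome."""
    action = decide_action(status, restarts_so_far)
    if action == "restart":
        # Assumed endpoint: verify it against your SCDF version's REST API guide.
        call("PUT", f"{SCDF_URL}/jobs/executions/{execution_id}?restart=true")
    if action in ("restart", "notify"):
        # Hypothetical payload shape for the webhook.
        call("POST", WEBHOOK_URL, {
            "executionId": execution_id,
            "status": status,
            "action": action,
        })
    return action
```

You would call `handle_execution` from a scheduler (cron, a loop with `time.sleep`, or similar) for each execution you are tracking, and keep the restart count somewhere persistent so that a job that keeps failing does not loop forever.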