I have been exploring Mesos and the Marathon framework for deploying applications. I am unsure how Marathon handles an application's files when the application is killed.
For example, we run Jenkins through Marathon. If the Jenkins server fails, Marathon restarts it, but the previously defined jobs are lost.
My question is: how can I ensure that when an application restarts, its old jobs are still available?
Thanks.
As of right now, Mesos/Marathon is great at supporting stateless applications, and support for stateful applications is growing. By default, task data is written into the task's sandbox and is therefore lost when the task fails or is restarted. Note that usually only a small percentage of tasks fail (e.g. only the tasks running on the failed node).
Now let us have a look at different failure scenarios.
Recovering from slave process failures: When only the Mesos slave process fails (or is upgraded), the framework can rely on slave checkpointing, which lets the restarted slave reconnect to the executors that are still running.
Executor failures (e.g. Jenkins process failures): In this case the framework could persist its own metadata on some persistent medium and use it when restarting. Note that this is highly application-specific, so Mesos/Marathon cannot offer a generic way to do it (and I am actually not sure what that would look like in the case of Jenkins). Persistent data could be written to HDFS or Cassandra, or you could have a look at the concept of dynamic reservations.
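To make the "persistent medium" idea a bit more concrete, here is a rough sketch of one simple option that is not among those listed above: mounting a host directory into a Dockerized Jenkins as its home directory, and pinning the app to that host so a restarted task sees the same files. It assumes Jenkins runs as a Docker container under Marathon. The app id, image name, host path, and hostname are all hypothetical placeholders, and the function `jenkins_app_definition` is only an illustration, not part of any Marathon client library.

```python
import json


def jenkins_app_definition(host_path, pinned_host):
    """Build a Marathon app definition (as a plain dict) for a Dockerized Jenkins.

    host_path   -- directory on the agent that survives task restarts (assumption)
    pinned_host -- hostname of the agent that owns that directory (assumption)
    """
    return {
        "id": "/jenkins",  # hypothetical app id
        "cpus": 1.0,
        "mem": 2048,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            # Assumed image; substitute whatever image you actually run.
            "docker": {"image": "jenkins/jenkins:lts"},
            "volumes": [
                {
                    # Jenkins home inside the container, where jobs are stored.
                    "containerPath": "/var/jenkins_home",
                    # Directory outside the sandbox, so it outlives the task.
                    "hostPath": host_path,
                    "mode": "RW",
                }
            ],
        },
        # Keep the task on the host that holds the data.
        "constraints": [["hostname", "CLUSTER", pinned_host]],
    }


if __name__ == "__main__":
    app = jenkins_app_definition("/mnt/jenkins_home", "agent-1.example.com")
    print(json.dumps(app, indent=2))
```

The printed JSON could then be submitted to Marathon's REST API (`POST /v2/apps`). The trade-off is that pinning to one host only helps with process failures: if that node itself dies, the data goes with it, which is where shared storage or the reservation-based approach becomes relevant.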