Prevent Kubernetes from restarting job on OOM

Question

Asked by Brent212 on 08 June 2022 at 04:31 · 608 views

I'm having a problem where a Job runs out of memory and K8s keeps trying to run it again, even though it has no chance of succeeding, since it will use the same amount of memory every time. I want K8s to simply let the Job fail and sit there. I'll take care of creating a new one with a higher memory limit, if desired, and/or deleting the existing failed Job.

I have

    restartPolicy: Never
    backoffLimit: 0

From the not-so-clear things I've read, setting backoffLimit to 1 might do the trick. But is that true? Would that make it restart once, or is the 1 the number of times it can be run, including the first attempt?

Should I switch from Jobs to bare Pods? The main issue with that is that I don't think K8s will restart the Pod on another worker node should the one it's running on go down, and that's a situation where I'd want the job to be restarted automatically on another node.

There are 2 answers

**P Ekambaram** · Answer 1 · 2022-06-08T05:00:26+00:00

backoffLimit should be 1 as shown below

backoffLimit: 1

**Fritz Duchardt** · Answer 2 · 2022-06-08T06:31:22+00:00

Setting backoffLimit to 0 is correct if the Job is supposed to run once and not be retried. From the API description of the field:

backoffLimit: Specifies the number of retries before marking this job failed.

Switching your workload to a Pod would make sense as long as you are not interested in restarts in combination with backoff limits.
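
For context, here is a minimal sketch of where the two settings from the question sit in a Job manifest. The Job name, container name, image, and memory limit are hypothetical placeholders, not values from the original post. Note that restartPolicy belongs to the Pod template, while backoffLimit belongs to the Job spec itself.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: memory-heavy-job            # hypothetical name
spec:
  backoffLimit: 0                   # zero retries: the first Pod failure marks the Job as failed
  template:
    spec:
      restartPolicy: Never          # the kubelet does not restart the failed container in place
      containers:
        - name: worker              # hypothetical container name
          image: example.com/worker:latest   # placeholder image
          resources:
            limits:
              memory: "512Mi"       # placeholder limit; exceeding it gets the container OOM-killed
```

Assuming this manifest, running kubectl describe job memory-heavy-job after a failure should show whether the Job was marked failed because the backoff limit was reached.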
