Wait for metrics to become available in metrics-server


I have a service set up in Kubernetes which seems fairly normal: a Deployment, a Service, and an HPA. However, it does something I'd like to fix. The sequence of events goes like this:

  1. We change the deployed image, which creates new pods.
  2. The pods become healthy and enter the service through the label selector.
  3. The HPA enters an unhealthy state because it cannot read the new pod metrics.
  4. I get notified through Argo Rollouts that the HPA is unhealthy.

I'd like to somehow delay pods entering service until their metrics are ready so we don't get this false alarm on every deploy.

Right now, we solve this by waiting 60 seconds before changing the labels in our blue/green rollout script, but that's pretty unsatisfying!

I think I could also do this by creating a liveness probe that asks for the pod's metrics, but that seems like a lot of hassle for something that should be easy. (For example, it doesn't look like the current namespace is in the environment by default. I guess I could get it with the downward API, but even then I'd have to bundle curl or kubectl in my container images, which I'd prefer not to do.)
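To make the idea concrete, here is a rough sketch of the kind of check I have in mind, written with only the Python standard library so it would not need curl or kubectl (it would need Python in the image, though). It is untested against our cluster and rests on several assumptions: the pod's service account is allowed to `get` pods in the `metrics.k8s.io` API group, the pod name matches `HOSTNAME`, and metrics-server answers with HTTP 404 until it has scraped the pod. The function names (`metrics_url`, `fetch_pod_metrics`, `wait_for_metrics`, `in_cluster_fetcher`) are my own and not part of any library. The namespace is read from the mounted service account directory rather than the downward API.

```python
import json
import os
import ssl
import time
import urllib.error
import urllib.request

# Standard mount point for the pod's service account credentials.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def metrics_url(host, port, namespace, pod):
    """Build the metrics.k8s.io URL for a single pod's PodMetrics object."""
    return (
        f"https://{host}:{port}/apis/metrics.k8s.io/v1beta1"
        f"/namespaces/{namespace}/pods/{pod}"
    )


def fetch_pod_metrics(url, token, cafile):
    """Return the parsed PodMetrics, or None if it does not exist yet.

    Assumption: a pod without metrics yields HTTP 404. Other errors propagate.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=5) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def wait_for_metrics(fetch, timeout=120.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until it reports container metrics or the timeout expires.

    Returns True once metrics are present, False if the deadline passes first.
    """
    deadline = clock() + timeout
    while True:
        metrics = fetch()
        if metrics and metrics.get("containers"):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def in_cluster_fetcher():
    """Return a zero-argument fetch function for the pod this code runs in."""
    with open(os.path.join(SA_DIR, "namespace")) as f:
        namespace = f.read().strip()
    with open(os.path.join(SA_DIR, "token")) as f:
        token = f.read().strip()
    url = metrics_url(
        os.environ["KUBERNETES_SERVICE_HOST"],
        os.environ["KUBERNETES_SERVICE_PORT"],
        namespace,
        os.environ["HOSTNAME"],  # assumption: hostname equals the pod name
    )
    cafile = os.path.join(SA_DIR, "ca.crt")
    return lambda: fetch_pod_metrics(url, token, cafile)
```

Inside a pod, `wait_for_metrics(in_cluster_fetcher())` would block until metrics show up, and a single `in_cluster_fetcher()()` call could back an exec probe. The same polling loop, pointed at the new pods from outside the cluster, could in principle replace the fixed 60 second sleep in the rollout script.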

Anyway, are other people even seeing this? If so, how are you solving it?


Editing to add information requested in a comment: we use Kubernetes 1.21 on an Amazon EKS cluster.
