I have a deployment that is configured with HPA and rolling update. One of my rollouts pushed a bad change to this deployment, which triggered the creation of a new replicaset. This new replicaset tried to scale up but none of the pods became ready, so the old replicaset still hung around and I had the ready pods from the old replicaset.
So far, all is as expected.
However, this deployment received a lot of traffic and needed to scale up from 1 replica to 4. The old replicaset (good) got 1 replica and the new replicaset (bad) got 2 replicas, none of which could come up. So the deployment only ended up with 2 replicas and a loss of availability.
How does an HPA scale-up pick which replicaset gets the additional replicas? If we had a way to control this, we could have prevented service errors.
Based on the official documentation, I think the HPA finds the resource named in scaleTargetRef on your HPA YAML file and then selects the pods to autoscale using that target resource's spec.selector labels. The HPA spec itself has no selector field of its own.
It is important to note that HPA is "demand"-based horizontal scaling, so an increased workload means more pods are deployed.
Also, HPA does not apply to objects that cannot be scaled, such as a DaemonSet.
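To make the scaleTargetRef part concrete, here is a minimal sketch of an HPA manifest. It is only an illustration, not a manifest taken from the question: the names my-app and my-app-hpa, the CPU metric, and the 70% threshold are all assumptions, and the 1 to 4 replica range simply mirrors the numbers mentioned in the question.

```yaml
# Illustrative HPA manifest; names and thresholds are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa            # hypothetical HPA name
spec:
  # The scalable object the HPA acts on. The pods it measures are
  # found through this target's own spec.selector labels.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed threshold, tune for your workload
```

Note that a manifest like this only names the Deployment. The HPA adjusts the replica count on that object, and nothing in the HPA spec lets you choose between the Deployment's replicasets.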