Kubernetes HPA not downscaling as expected

What happened: I've configured an HPA with these details:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api-horizontalautoscaler
  namespace: develop
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: api-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 400Mi

What I expected to happen: The pods scaled up to 3 when we put some load on them and the average memory exceeded 400 MiB, which was expected. The average memory has since dropped back to roughly 300 MiB, yet the pods still haven't scaled down, even though they have been below the target for a couple of hours now. (Screenshot of the metrics omitted.)

A day later: (screenshot omitted)

I expected the pods to scale down once the average memory fell below 400 MiB.

Environment:

  • Kubernetes version (using kubectl version):
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.9", GitCommit:"3e4f6a92de5f259ef313ad876bb008897f6a98f0", GitTreeState:"clean", BuildDate:"2019-08-05T09:22:00Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.10", GitCommit:"37d169313237cb4ceb2cc4bef300f2ae3053c1a2", GitTreeState:"clean", BuildDate:"2019-08-19T10:44:49Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g: cat /etc/os-release):
> cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
  • Kernel (e.g. uname -a): x86_64 x86_64 x86_64 GNU/Linux

I would really like to know why this is happening. I'll be happy to provide any further information needed.

Thanks!

There are 2 answers

Wytrzymały Wiktor (Best Answer)

There are two things to look at:

The beta version, which includes support for scaling on memory and custom metrics, can be found in autoscaling/v2beta2. The new fields introduced in autoscaling/v2beta2 are preserved as annotations when working with autoscaling/v1.
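
For instance, an HPA created through a v2 API but read back through autoscaling/v1 carries its metric spec in an annotation, roughly like this (a sketch; the annotation value shown is illustrative, and the exact JSON your cluster stores will differ):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: api-horizontalautoscaler
  namespace: develop
  annotations:
    # v2 metric fields preserved as an annotation (illustrative value)
    autoscaling.alpha.kubernetes.io/metrics: '[{"type":"Resource","resource":{"name":"memory","targetAverageValue":"400Mi"}}]'
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: api-deployment
  minReplicas: 1
  maxReplicas: 4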

autoscaling/v2beta2 was introduced in Kubernetes 1.12, so even though you are on 1.13 (which is many minor versions behind by now; upgrading is recommended) it should work fine. Try changing your apiVersion: to autoscaling/v2beta2, as sketched below.
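
Here is a sketch of the same HPA on autoscaling/v2beta2. The flat targetAverageValue field becomes a target: block with an explicit type; moving the scaleTargetRef to apps/v1 is also assumed here, since extensions/v1beta1 Deployments are deprecated:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: api-horizontalautoscaler
  namespace: develop
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      # v2beta2 schema: the target gains an explicit type
      target:
        type: AverageValue
        averageValue: 400Mi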

--horizontal-pod-autoscaler-downscale-stabilization: The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).

After changing the API version as suggested above, check the value of this particular flag; a sketch of where it is set follows.
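
If you run the control plane yourself, the flag is set on kube-controller-manager. A kubeadm-style static pod at /etc/kubernetes/manifests/kube-controller-manager.yaml is assumed in this sketch (on a managed control plane the flag is usually not accessible):

apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: k8s.gcr.io/kube-controller-manager:v1.13.10
    command:
    - kube-controller-manager
    # ...keep the existing flags, then add or adjust:
    # 5m0s is the default; a larger value delays scale-down further
    - --horizontal-pod-autoscaler-downscale-stabilization=5m0s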

David Maze

The formula for how the HPA decides how many pods to run is in the Horizontal Pod Autoscaler documentation:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

With the numbers you give, currentReplicas is 3, currentMetricValue is 300 MiB, and desiredMetricValue is 400 MiB, so this reduces to

desiredReplicas = ceil[3 * (300 / 400)]
desiredReplicas = ceil[3 * 0.75]
desiredReplicas = ceil[2.25]
desiredReplicas = 3

You need to decrease the load further (below about 267 MiB average memory, since that is the point where 3 * (x / 400) drops to 2.0) or increase the target memory value for this to scale down.

(Simply being below the target won't trigger scale-down on its own; you must be far enough below the target for this formula to produce a lower replica count. This helps avoid thrashing when the load hovers right around a threshold that would trigger scaling in one direction or the other.)