I created an HPA for my deployment. Scaling up to the maximum number of replicas (6 in my case) works fine, but when the load drops it only scales down to 5. It should return to the original replica count (1 in my case) once the load is back to normal. I checked again after 30-40 minutes and the application still has 5 replicas when it should have 1.
[ec2-user@ip-192-168-x-x ~]$ kubectl describe hpa admin-dev -n dev
Name: admin-dev
Namespace: dev
Labels: <none>
Annotations: <none>
CreationTimestamp: Thu, 24 Oct 2019 07:36:32 +0000
Reference: Deployment/admin-dev
Metrics: ( current / target )
resource memory on pods (as a percentage of request): 49% (1285662037333m) / 60%
Min replicas: 1
Max replicas: 10
Deployment pods: 3 current / 3 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 13m horizontal-pod-autoscaler New size: 2; reason: memory resource utilization (percentage of request) above target
Normal SuccessfulRescale 5m27s horizontal-pod-autoscaler New size: 3; reason: memory resource utilization (percentage of request) above target
I answered this on GitHub: https://github.com/kubernetes/kubernetes/issues/78761#issuecomment-1075814510
Here's a summary: the problem is in the calculation that decides whether to scale up or down. The scale-down equation works when the change in utilization caused by the load difference is large, which is usually the case with CPU (e.g. 100m - 500m <=> 20% - 100%). It fails when the change in utilization is small, which is usually the case with memory (e.g. 160Mi - 200Mi <=> 80% - 100%). For now it's better to stick to the CPU metric and make sure currentMetricValue at idle is at most half of desiredMetricValue. You can apply this rule to both metrics: currentMetricValue * 2 <= desiredMetricValue
to make sure it always scales down.
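
To make the arithmetic concrete, here is a rough Python sketch of the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), including the default 10% tolerance. The function name `desired_replicas` is made up for illustration, and the sketch assumes a single metric and ignores min/max clamping, the stabilization window and pod readiness, so treat it as an approximation rather than the controller's actual code.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Approximate the documented HPA formula for a single metric.

    Hypothetical helper: no min/max clamping, no stabilization window,
    no handling of unready pods or missing metrics.
    """
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Rounding up is what makes the last scale-down steps hard to reach.
    return math.ceil(current_replicas * current_value / target_value)


# Numbers from the question: 49% memory utilization against a 60% target, 3 pods.
print(desired_replicas(3, 49, 60))  # 3 (ceil(2.45) rounds back up, so no scale-down)

# Going from 2 pods to 1 needs utilization at or below half the target.
print(desired_replicas(2, 30, 60))  # 1
print(desired_replicas(2, 31, 60))  # 2
```

Because of the ceiling, the step from 2 replicas to 1 only happens when the ratio is 0.5 or lower, which is where the currentMetricValue * 2 <= desiredMetricValue rule comes from.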