I have a working Kubernetes 1.23.9 cluster on Google Kubernetes Engine with multi-cluster services enabled: one cluster is hosted in the US and another in the EU. I have multiple Deployments, each with an HPA configured through YAML. Out of 7 Deployments, the HPA is working for only one. service-1 can only be accessed internally from service-2, and service-2 is exposed through an HttpGateway by GKE. More information is below. Any help would be greatly appreciated.
Deployment file. I have posted only 2 apps: service-2's HPA is working fine, whereas service-1's is not.
$ cat deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-1
  namespace: backend
  labels:
    app: service-1
spec:
  replicas: 1
  selector:
    matchLabels:
      lbtype: internal
  template:
    metadata:
      labels:
        lbtype: internal
        app: service-1
    spec:
      containers:
      - name: service-1
        image: [REDACTED]
        ports:
        - containerPort: [REDACTED]
          name: "[REDACTED]"
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
      imagePullSecrets:
      - name: docker-gcr
      restartPolicy: Always
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-2
  namespace: backend
  labels:
    app: service-2
spec:
  replicas: 2
  selector:
    matchLabels:
      lbtype: external
  template:
    metadata:
      labels:
        lbtype: external
        app: service-2
    spec:
      containers:
      - name: service-2
        image: [REDACTED]
        ports:
        - containerPort: [REDACTED]
          name: "[REDACTED]"
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
      imagePullSecrets:
      - name: docker-gcr
      restartPolicy: Always
HorizontalPodAutoscaler file:
$ cat horizontal-pod-scaling.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-1
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-1
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-2
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-2
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Service file:
$ cat service.yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-internal
  namespace: backend
spec:
  type: ClusterIP
  ports:
  - name: service-1
    port: [REDACTED]
    targetPort: "[REDACTED]"
  selector:
    lbtype: internal
---
apiVersion: v1
kind: Service
metadata:
  name: backend-middleware
  namespace: backend
spec:
  ports:
  - name: service-2
    port: [REDACTED]
    targetPort: "[REDACTED]"
  selector:
    lbtype: external
$ kctl get hpa
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
service-1   Deployment/service-1   <unknown>/70%   1         2         1          18h
service-2   Deployment/service-2   4%/70%          2         4         2          18h
$ kctl top pod
NAME                        CPU(cores)   MEMORY(bytes)
service-1-8f7dc66cc-xtz76   3m           66Mi
service-2-5fd767cbc-vm7f5   4m           76Mi
$ kubectl describe deployment metrics-server-v0.5.2 -nkube-system
Name:                   metrics-server-v0.5.2
Namespace:              kube-system
CreationTimestamp:      Fri, 02 Dec 2022 11:01:18 +0530
Labels:                 addonmanager.kubernetes.io/mode=Reconcile
                        k8s-app=metrics-server
                        version=v0.5.2
Annotations:            components.gke.io/layer: addon
                        deployment.kubernetes.io/revision: 4
Selector:               k8s-app=metrics-server,version=v0.5.2
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
...
Containers:
  metrics-server:
    Image:      gke.gcr.io/metrics-server:v0.5.2-gke.1
    Port:       10250/TCP
    Host Port:  10250/TCP
    Command:
      /metrics-server
      --metric-resolution=30s
      --kubelet-port=10255
      --deprecated-kubelet-completely-insecure=true
      --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
      --cert-dir=/tmp
      --secure-port=10250
$ kctl describe hpa service-1
Conditions:
  Type            Status  Reason                   Message
  ----            ------  ------                   -------
  AbleToScale     True    ReadyForNewScale         recommended size matches current size
  ScalingActive   False   FailedGetResourceMetric  the HPA was unable to compute the replica count: no recommendation
  ScalingLimited  False   DesiredWithinRange       the desired count is within the acceptable range
Events:
  Type     Reason                   Age                  From                       Message
  ----     ------                   ----                 ----                       -------
  Warning  FailedGetResourceMetric  2m (x4470 over 18h)  horizontal-pod-autoscaler  no recommendation
$ kctl describe hpa service-2
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:           <none>
As per my understanding, ScalingActive=False by itself should not affect autoscaling in a major way.
Check the possible solutions below:
1) Check the resource metric: Try removing the LIMITS from your Deployments and setting only the relevant resource REQUESTS on the Pod's containers at the Deployment level; it may work. If the HPA starts working, you can experiment with LIMITS afterwards. This discussion tells you that REQUESTS alone are sufficient for the HPA.
2) FailedGetResourceMetric: Check whether the metric is registered and available (also look at "Custom and external metrics"). Try executing the commands
kubectl top node
and
kubectl top pod -A
to verify that metrics-server is working properly. The HPA controller runs regularly to check whether any adjustments to the system are required. During each run, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. The controller manager obtains the metrics from either the resource metrics API (for per-pod resource metrics) or the custom metrics API (for all other metrics).
Basically, the HPA targets a Deployment by name and uses that Deployment's selector labels to find the pods whose metrics it reads. If two Deployments use the same selector, the HPA gets metrics for the pods of both Deployments. Try the same Deployment on a kind cluster and it may work fine.
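To illustrate the selector overlap described in point 2, here is a minimal Python sketch; it is not the controller's actual code. The `service-3` Deployment and its pod are hypothetical, added only because the question mentions other apps whose selectors are not shown, and `selects` and `shared_pods` are illustrative helper names.

```python
def selects(match_labels, pod_labels):
    """A matchLabels selector matches a pod when every key/value pair
    in the selector is present in the pod's labels."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def shared_pods(selectors, pods):
    """Return {pod name: [deployment names]} for every pod that is
    matched by more than one Deployment selector."""
    result = {}
    for pod_name, labels in pods.items():
        owners = sorted(
            name for name, sel in selectors.items() if selects(sel, labels)
        )
        if len(owners) > 1:
            result[pod_name] = owners
    return result


# Selectors as posted in the question, plus one ASSUMED extra Deployment.
selectors = {
    "service-1": {"lbtype": "internal"},
    "service-2": {"lbtype": "external"},
    "service-3": {"lbtype": "internal"},  # hypothetical, not in the question
}

# Pod labels taken from the pod templates in the question.
pods = {
    "service-1-8f7dc66cc-xtz76": {"lbtype": "internal", "app": "service-1"},
    "service-2-5fd767cbc-vm7f5": {"lbtype": "external", "app": "service-2"},
    "service-3-pod": {"lbtype": "internal", "app": "service-3"},  # hypothetical
}

overlaps = shared_pods(selectors, pods)
for pod_name, owners in overlaps.items():
    print(pod_name, "is selected by", owners)
```

If the other Deployments in the namespace really do share `lbtype: internal`, adding `app: service-1` to `spec.selector.matchLabels` would make the selector unique. Note that `spec.selector` is immutable in `apps/v1`, so the Deployment has to be deleted and recreated for that change.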
3) Metrics Server: Kubernetes Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines, and CPU/memory-based horizontal autoscaling relies on it. Check the requirements: Metrics Server has specific requirements for cluster and network configuration, and these aren't the default for all cluster distributions. Please ensure that your cluster distribution supports them before using Metrics Server.
4) The HPA processes scale-up events every 15-30 seconds, and scaling may take around 3-4 minutes because of latency in the metrics data.
5) Check this relevant SO question for more information.
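To make point 1 concrete, the sketch below shows why CPU REQUESTS matter: a `Utilization` target is a percentage of the request, and the replica count follows the documented formula `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. This is a simplified illustration, not the controller's implementation: it ignores not-ready pods, missing metrics, and stabilization windows, and the function names are my own. The 0.1 tolerance is the documented default.

```python
import math


def cpu_utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as a percentage of the requested CPU.
    Simplification: total usage divided by total requests."""
    if not request_millicores or sum(request_millicores) == 0:
        # Without a CPU request there is nothing to take a percentage of.
        raise ValueError("missing CPU request: utilization cannot be computed")
    return 100 * sum(usage_millicores) / sum(request_millicores)


def desired_replicas(current, utilization, target,
                     min_replicas, max_replicas, tolerance=0.1):
    """Apply the HPA formula, then clamp to [min_replicas, max_replicas]."""
    ratio = utilization / target
    if abs(ratio - 1.0) <= tolerance:
        desired = current  # close enough to the target: no change
    else:
        desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Numbers from the question: the service-2 pod uses 4m against a 100m request.
utilization = cpu_utilization_percent([4], [100])
replicas = desired_replicas(2, utilization, 70, min_replicas=2, max_replicas=4)
print(utilization, replicas)
```

With the question's numbers, the raw recommendation is below `minReplicas`, so it is clamped to 2, which is consistent with the `TooFewReplicas` condition reported for service-2.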