Metrics server is currently unable to handle the request


I am new to Kubernetes and was trying to apply horizontal pod autoscaling to my existing application. After following other Stack Overflow answers I learned that I need to install metrics-server, which I did, but somehow it's not working and is unable to handle requests.

  • I then followed a few more suggestions but was unable to resolve the issue. I would really appreciate any help here; please let me know if you need any further details. Thanks in advance.

Steps followed:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

kubectl get deploy,svc -n kube-system | egrep metrics-server

deployment.apps/metrics-server   1/1     1            1           2m6s
service/metrics-server                       ClusterIP   10.32.0.32   <none>        443/TCP                        2m6s

kubectl get pods -n kube-system | grep metrics-server

metrics-server-64cf6869bd-6gx88   1/1     Running   0          2m39s

vi ana_hpa.yaml

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ana-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: common-services-auth
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 160
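
(Note: averageUtilization targets are computed against the pods' resource requests, so the target StatefulSet's containers must also declare cpu and memory requests for these metrics to resolve. A minimal sketch, with a placeholder container name and placeholder values:)

    # Sketch only: the container name and request values are hypothetical,
    # not taken from the actual common-services-auth StatefulSet.
    spec:
      template:
        spec:
          containers:
          - name: auth
            resources:
              requests:
                cpu: 250m      # basis for the cpu Utilization target
                memory: 256Mi  # basis for the memory Utilization target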
    

k apply -f ana_hpa.yaml

horizontalpodautoscaler.autoscaling/ana-hpa created

k get hpa

NAME      REFERENCE                          TARGETS                         MINPODS   MAXPODS   REPLICAS   AGE
ana-hpa   StatefulSet/common-services-auth   <unknown>/160%, <unknown>/80%   1         10        0          4s

k describe hpa ana-hpa

Name:                                                     ana-hpa
Namespace:                                                default
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Tue, 12 Apr 2022 17:01:25 +0530
Reference:                                                StatefulSet/common-services-auth
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  <unknown> / 160%
  resource cpu on pods  (as a percentage of request):     <unknown> / 80%
Min replicas:                                             1
Max replicas:                                             10
StatefulSet pods:                                         3 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
  Type     Reason                        Age                  From                       Message
  ----     ------                        ----                 ----                       -------
  Warning  FailedGetResourceMetric       38s (x8 over 2m23s)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
  Warning  FailedComputeMetricsReplicas  38s (x8 over 2m23s)  horizontal-pod-autoscaler  invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
  Warning  FailedGetResourceMetric       23s (x9 over 2m23s)  horizontal-pod-autoscaler  failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
  
  

kubectl get --raw /apis/metrics.k8s.io/v1beta1

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl edit deployments.apps -n kube-system metrics-server

Add hostNetwork: true

deployment.apps/metrics-server edited

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

k describe pod metrics-server-5dc6dbdb8-42hw9 -n kube-system

Name:                 metrics-server-5dc6dbdb8-42hw9
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 pusntyn196.apac.avaya.com/10.133.85.196
Start Time:           Tue, 12 Apr 2022 17:08:25 +0530
Labels:               k8s-app=metrics-server
                      pod-template-hash=5dc6dbdb8
Annotations:          <none>
Status:               Running
IP:                   10.133.85.196
IPs:
  IP:           10.133.85.196
Controlled By:  ReplicaSet/metrics-server-5dc6dbdb8
Containers:
  metrics-server:
    Container ID:  containerd://024afb1998dce4c0bd5f4e58f996068ea37982bd501b54fda2ef8d5c1098b4f4
    Image:         k8s.gcr.io/metrics-server/metrics-server:v0.6.1
    Image ID:      k8s.gcr.io/metrics-server/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
    Port:          4443/TCP
    Host Port:     4443/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --kubelet-use-node-status-port
      --metric-resolution=15s
    State:          Running
      Started:      Tue, 12 Apr 2022 17:08:26 +0530
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     200Mi
    Liveness:     http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g6p4g (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-g6p4g:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 2s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 2s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m31s  default-scheduler  Successfully assigned kube-system/metrics-server-5dc6dbdb8-42hw9 to pusntyn196.apac.avaya.com
  Normal  Pulled     2m32s  kubelet            Container image "k8s.gcr.io/metrics-server/metrics-server:v0.6.1" already present on machine
  Normal  Created    2m31s  kubelet            Created container metrics-server
  Normal  Started    2m31s  kubelet            Started container metrics-server
  

kubectl get --raw /apis/metrics.k8s.io/v1beta1

Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

kubectl logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system

E0412 11:43:54.684784       1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:44:27.001010       1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"

k logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
    I0412 11:38:26.447305       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
    I0412 11:38:26.899459       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
    I0412 11:38:26.899477       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
    I0412 11:38:26.899518       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
    I0412 11:38:26.899545       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    I0412 11:38:26.899546       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
    I0412 11:38:26.899567       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0412 11:38:26.900480       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
    I0412 11:38:26.900811       1 secure_serving.go:266] Serving securely on [::]:4443
    I0412 11:38:26.900854       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
    W0412 11:38:26.900965       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
    I0412 11:38:26.999960       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
    I0412 11:38:26.999989       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
    I0412 11:38:26.999970       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
    E0412 11:38:27.000087       1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
    E0412 11:38:27.000118       1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"

kubectl top nodes

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

kubectl top pods

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

Edit the metrics-server deployment yaml

Add - --kubelet-insecure-tls

k apply -f metric-server-deployment.yaml

serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged

kubectl get pods -n kube-system | grep metrics-server

metrics-server-5dc6dbdb8-42hw9   1/1     Running   0          10m

kubectl top pods

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

Also tried adding the below to the metrics-server deployment:

    command:
    - /metrics-server
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP

There are 9 answers

6
d2k

This can easily be resolved by editing the deployment yaml and adding hostNetwork: true after dnsPolicy: ClusterFirst:

kubectl edit deployments.apps -n kube-system metrics-server

insert:

hostNetwork: true
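
A rough sketch of where the field ends up in the edited manifest (it sits on the pod template's spec, next to dnsPolicy):

    ...
    spec:
      template:
        spec:
          dnsPolicy: ClusterFirst
          hostNetwork: true   # the added line
          containers:
          - name: metrics-server
            ...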
0
molokovskikh

I hope this helps somebody with a bare-metal cluster:

$ helm --repo https://kubernetes-sigs.github.io/metrics-server/ --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system --set args='{--kubelet-insecure-tls}' upgrade --install metrics-server metrics-server
$ helm --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system uninstall metrics-server
0
Geoffrey

For me, on EKS with helmfile, I had to set the following in the values.yaml for the metrics-server chart:

containerPort: 10250

The value was set to 4443 by default, for an unknown reason, when I first deployed the chart.

See the metrics-server chart documentation.

Then kubectl top nodes and kubectl describe apiservice v1beta1.metrics.k8s.io were working.
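
For reference, a rough helmfile sketch of passing that value (the release layout and repository alias here are assumptions, not taken from the original setup):

    repositories:
      - name: metrics-server
        url: https://kubernetes-sigs.github.io/metrics-server/

    releases:
      - name: metrics-server
        namespace: kube-system
        chart: metrics-server/metrics-server
        values:
          - containerPort: 10250   # instead of the 4443 that was applied by default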

0
Mostafa Ghadimi

First of all, execute the following command:

kubectl get apiservices

Then check the availability (status) of the kube-system/metrics-server service.

  • In case the availability is True: Add hostNetwork: true to the spec of your metrics-server deployment by executing the following command:

    kubectl edit deployment -n kube-system metrics-server
    

    It should look like the following:

    ...
    spec:
      hostNetwork: true
    ...
    
    

    Setting hostNetwork to true means that the Pod uses the network of the host it's running on.

  • In case the availability is False (MissingEndpoints):

    1. Download metrics-server:

      wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
      
    2. Remove (legacy) metrics server:

      kubectl delete -f components.yaml  
      
    3. Edit the downloaded file and add - --kubelet-insecure-tls to the args list:

      ...
      labels:
        k8s-app: metrics-server
      spec:
        containers:
        - args:
          - --cert-dir=/tmp
          - --secure-port=443
          - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
          - --kubelet-use-node-status-port
          - --metric-resolution=15s
          - --kubelet-insecure-tls # add this line
      ...
      
    4. Create service once again:

      kubectl apply -f components.yaml
      
0
Immanuel

For us, on a Google Cloud GKE private cluster, we had to add a firewall rule to allow this traffic.

The use case here is an "aggregated API server", and the process for adding the correct firewall rule is described in the private cluster docs.

In our case we had to allow port 6443, taken from the failure message in kubectl describe apiservice v1beta1.custom.metrics.k8s.io:

Status:
  Conditions:
    Last Transition Time:  2023-03-07T11:17:17Z
    Message:               failing or missing response from https://10.4.1.14:6443 [...]
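
A rough sketch of such a rule (the rule name, network, node tag and master CIDR below are placeholders for your cluster's actual values):

gcloud compute firewall-rules create allow-master-to-metrics-server \
  --network my-cluster-network \
  --direction INGRESS \
  --source-ranges 172.16.0.0/28 \
  --target-tags my-cluster-node-tag \
  --allow tcp:6443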
0
s_musarat

For me, this was occurring on my local k3s cluster, and I found that restarting the k3s service resolves it after 5 to 10 minutes.

sudo systemctl restart k3s
0
SilentEntity

Please configure the aggregation layer correctly and carefully; you can use this link for help: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/.

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: <name of the registration object>
spec:
  group: <API group name this extension apiserver hosts>
  version: <API version this extension apiserver hosts>
  groupPriorityMinimum: <priority this APIService for this group, see API documentation>
  versionPriority: <prioritizes ordering of this version within a group, see API documentation>
  service:
    namespace: <namespace of the extension apiserver service>
    name: <name of the extension apiserver service>
  caBundle: <pem encoded ca cert that signs the server cert used by the webhook> 
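
For comparison, the APIService that the stock metrics-server components.yaml registers looks roughly like this:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
  labels:
    k8s-app: metrics-server
spec:
  group: metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system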

It would also be helpful to provide the output of kubectl version.

2
zer0

Update: I deployed the metrics-server using the same command. Perhaps you can start fresh by removing existing resources and running:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

=======================================================================

It appears the --kubelet-insecure-tls flag was not configured correctly for the pod template in the deployment. The following should fix this:

  1. Edit the existing deployment in the cluster with kubectl edit deployment/metrics-server -nkube-system.
  2. Add the flag to the spec.containers[].args list, so that the deployment looks like this:
...
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls      <=======ADD IT HERE.
        image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
...
  3. Simply save your changes and let the deployment roll out the updated pods. You can use watch -n1 kubectl get deployment/metrics-server -n kube-system and wait for the UP-TO-DATE column to show 1.

Like this:

NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           16m
  4. Verify with kubectl top nodes. It will show something like:
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
docker-desktop   222m         5%     1600Mi          41%

I've just verified this to work on a local setup. Let me know if this helps :)

0
user23197111

I work on AWS EKS. I had the same problem after installing the Helm chart. The problem was a misconfiguration of the namespace and port between my deployment and v1beta1.metrics.k8s.io.

I ran these commands to check it:

# kubectl describe deployment metrics-server -n <Your namespace>

# kubectl describe apiservice v1beta1.metrics.k8s.io

# kubectl edit deployment metrics-server -n "Your namespace"

# kubectl edit apiservice v1beta1.metrics.k8s.io

After I set it to 4443 and the same namespace, I checked with:

# kubectl top nodes

and got a positive result.
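
As a rough sketch, these are the fields that have to line up (exact values depend on how and where the chart was installed):

# APIService -> must point at the metrics-server Service
spec:
  service:
    name: metrics-server
    namespace: kube-system   # the namespace the chart was installed into
    port: 443

# Service -> targetPort must match the container's port (--secure-port)
spec:
  ports:
  - port: 443
    targetPort: 4443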