Scaling kube-state-metrics in prometheus-operator


In prometheus-operator, I want to increase the kube-state-metrics replicas to 2. Because the default service discovery role is endpoints, Prometheus will scrape each pod individually, so every metric will be scraped twice. That causes many-to-many matching issues in queries and is wasteful.

This came up because a node went down that was running kube-state-metrics, among other pods. I had no visibility into my cluster until a new pod was scheduled, so it is important to me that kube-state-metrics is redundant.

How can I configure the kubernetes_sd_configs role for kube-state-metrics to be service, so that Prometheus uses the service as a load balancer instead of scraping each pod behind it? Or, alternatively, how can I scale the kube-state-metrics pods without sharding?

Current config:

- job_name: monitoring/prometheus-operator-kube-state-metrics/0
  kubernetes_sd_configs:
  - role: endpoints

What I want:

- job_name: monitoring/prometheus-operator-kube-state-metrics/0
  kubernetes_sd_configs:
  - role: service

There is 1 answer

Answer by Evgeny Zislis (best answer)

Yes, you can.

Your endpoints job keeps only services annotated with prometheus.io/scrape: "true". You can use a different annotation, such as prometheus.io/probe, to select the services that should be scraped through the service itself rather than pod by pod.

Suppose you have a job like this, which scrapes each endpoint individually:

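- job_name: kubernetes-endpoints
  # Send the service account token with each scrape request.
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    # One target per endpoint address, so one per backing pod.
    - role: endpoints
  relabel_configs:
    # Keep only endpoints whose service is annotated prometheus.io/scrape: "true".
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: "true"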

You can add another job that scrapes only the service address instead of each endpoint behind it:

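- job_name: kubernetes-services
  # Query parameter used by blackbox exporter modules;
  # not needed when scraping the service directly.
  params:
    module: [http_2xx]
  kubernetes_sd_configs:
    # One target per service port, addressed by the service DNS name.
    - role: service
  relabel_configs:
    # Keep only services annotated prometheus.io/probe: "true".
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
      action: keep
      regex: "true"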
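
Note that the job above only filters on the prometheus.io/probe annotation; it does not read any other annotation. If the scrape path should come from a prometheus.io/path annotation on the service, a relabel rule has to copy it into __metrics_path__. A minimal sketch of how that could look, assuming the same job name and annotation scheme as above (the extra rule is an illustration, not part of the original answer):

```yaml
- job_name: kubernetes-services
  kubernetes_sd_configs:
    - role: service
  relabel_configs:
    # Keep only services annotated prometheus.io/probe: "true".
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
      action: keep
      regex: "true"
    # Assumption: the service may carry a prometheus.io/path annotation.
    # If it is present and non-empty, use it as the scrape path;
    # otherwise the default /metrics path is kept.
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
```

Since kube-state-metrics already serves on /metrics, the extra rule is probably only needed for services that expose metrics on a different path.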

Then just make sure you set the correct annotations on the service, like so:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/probe: "true"
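
For context, here is a rough sketch of what a complete Service for kube-state-metrics might look like with these annotations. The name, namespace, selector label, and port name are assumptions for illustration and need to match your actual deployment; 8080 is the default kube-state-metrics metrics port, but check what your chart exposes.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics          # assumed service name
  namespace: monitoring             # assumed namespace
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/probe: "true"     # picked up by the service-role job
spec:
  selector:
    app.kubernetes.io/name: kube-state-metrics   # assumed pod label
  ports:
    - name: http-metrics            # assumed port name
      port: 8080                    # default kube-state-metrics metrics port
      targetPort: 8080
```

Keep in mind that the service role produces one target per service port, so if the Service exposes more than one port, each port will be scraped unless you filter by port in relabel_configs. Also make sure this Service does not carry prometheus.io/scrape: "true", or the endpoints job will still scrape every pod.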