HorizontalPodAutoscaler manifest not filtering metricSelector.matchLabels

We are working with Redis Queues and exposing some metrics with Prometheus Adapter so that we can use an HPA to scale our deployments based on the number of records in "Queued" status.

The problem I'm seeing is that no matter how I try to filter the HPA metric, it considers the total value instead of honoring the metricSelector.matchLabels I'm providing.

The HPA itself is working, but it scales on everything returned by the External Metrics API instead of the per-scanner value.

See below.

"CloneRepo" and "Semgrep" would sum 370. If I apply the HPA only for "semgrep" scanner, it would scale to 13 PODs because of the total sum of 370, instead of considering the expect value of 0.

It seems the metricSelector.matchLabels in the manifest is either being ignored or incorrectly defined.

Does anyone have any idea what is wrong here?

K8s: 1.27 (running on AWS EKS)
Namespace: scanners

A simple sample of the rq_jobs metrics from Prometheus looks like this:

clonerepo - 370 records in "Queued" status
semgrep - 0 records in "Queued" status

  • Prometheus Metrics for rq_jobs. Notice the "scm_instance" and "queue" labels.
rq_jobs{container="rq-exporter-bitbucket", endpoint="http-metrics", instance="100.64.13.45:9726", job="rq-exporter-bitbucket", namespace="scanners", pod="rq-exporter-bitbucket-5879d9bd4b-zcnns", queue="clonerepo", scm_instance="rq-exporter-bitbucket", service="rq-exporter-bitbucket", status="queued"}
370
rq_jobs{container="rq-exporter-bitbucket", endpoint="http-metrics", instance="100.64.13.45:9726", job="rq-exporter-bitbucket", namespace="scanners", pod="rq-exporter-bitbucket-5879d9bd4b-zcnns", queue="semgrep", scm_instance="rq-exporter-bitbucket", service="rq-exporter-bitbucket", status="queued"}
0

HPA Manifest

  • Notice the matchLabels section; this is where we expect to filter on the metric name exposed by Prometheus Adapter plus the queue and scm_instance labels.

  • If I use autoscaling/v1 with targetAverageValue, it works. When I switch to autoscaling/v2, the HPA falls back to CPU utilization, which differs from what I've read online (see the autoscaling/v2 sketch after the manifest below). Anyway, that is a separate issue; with v1 the scaling is based on the numbers, just not filtered as expected.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-semgrep
  namespace: scanners
  annotations:
    autoscaling.alpha.kubernetes.io/metrics: |
      [
        {
          "type": "External",
          "external": {
            "metricName": "hpa_rq_jobs",
            "metricSelector": {
              "matchLabels": {
                  "queue": "semgrep",
                  "scm_instance": "rq-exporter-bitbucket"
              }
            },
            "targetAverageValue": "30"
          }
        }
      ]
spec:
  maxReplicas: 20
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deploy-scanner-semgrep
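
For comparison, here is a minimal sketch of the same external metric expressed natively in autoscaling/v2 (same names as the manifest above; an illustrative, untested equivalent, since v2 moves external metrics from the alpha annotation into spec.metrics):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-semgrep
  namespace: scanners
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deploy-scanner-semgrep
  minReplicas: 2
  maxReplicas: 20
  metrics:
    # External metric plus selector; the v2 equivalent of the
    # autoscaling.alpha.kubernetes.io/metrics annotation above
    - type: External
      external:
        metric:
          name: hpa_rq_jobs
          selector:
            matchLabels:
              queue: semgrep
              scm_instance: rq-exporter-bitbucket
        target:
          type: AverageValue
          averageValue: "30"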

This is the Prometheus Adapter definition:

external:
    - seriesQuery: '{__name__=~"rq_jobs", namespace="scanners", scm_instance!="", status="queued"}'
      resources:
        overrides:
          namespace:
            resource: namespace
      name:
        matches: ^(.*)
        as: "hpa_rq_jobs"
      metricsQuery: sum(rq_jobs{status='queued', scm_instance!=""}) by (scm_instance, queue)

We can query the available metrics through the External Metrics API, and they look correct:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/scanners/hpa_rq_jobs | jq .
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "hpa_rq_jobs",
      "metricLabels": {
        "queue": "scmclone_repo",
        "scm_instance": "rq-exporter-bitbucket"
      },
      "timestamp": "2023-11-26T11:45:26Z",
      "value": "370"
    },
    {
      "metricName": "hpa_rq_jobs",
      "metricLabels": {
        "queue": "semgrep",
        "scm_instance": "rq-exporter-bitbucket"
      },
      "timestamp": "2023-11-26T11:45:26Z",
      "value": "0"
    }
  ]
}

From this point, we would expect the HPA for semgrep to stay at 2 replicas, but instead it scales to 13, driven by the total value of the exposed metric, even though the manifest selects:

            "metricSelector": {
              "matchLabels": {
                  "queue": "semgrep",
                  "scm_instance": "esi-bitbucket"
              }
            },
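
As a sanity check, the selector the HPA sends can be reproduced by passing it to the External Metrics API as a labelSelector query parameter (URL-encoded; this mirrors how the controller forwards the metricSelector). If the adapter ignores the selector, both series still come back:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/scanners/hpa_rq_jobs?labelSelector=queue%3Dsemgrep,scm_instance%3Drq-exporter-bitbucket" | jq .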

Appreciate any help on this.

Per the documentation, custom and external metrics can be exposed and used by the HPA. autoscaling/v1 focuses on CPU and memory, while autoscaling/v2 supports custom and external metrics as well as other spec fields.

From the links in the suggested answers, this one fits the description: HPA labelSelector not filtering external metrics.

1 Answer

Answered by Daniel Szortyka

The solution is in the Prometheus Adapter custom/external metrics rules.

See the reply here: https://github.com/kubernetes-sigs/prometheus-adapter/issues/255#issuecomment-588471197

In short, even when you provide filters in your metricsQuery, the query template needs <<.LabelMatchers>>; without it, the metricSelector sent by the HPA is never translated into PromQL label matchers, so the adapter always returns the unfiltered sum.

external:
    - seriesQuery: '{__name__=~"rq_jobs", namespace="scanners", scm_instance!="", status="queued"}'
      resources:
        overrides:
          namespace:
            resource: namespace
      name:
        matches: ^(.*)
        as: "hpa_rq_jobs"
      metricsQuery: sum(rq_jobs{status="queued", namespace="scanners", scm_instance!="", <<.LabelMatchers>>}) by (scm_instance, queue)
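
With the metricSelector from the HPA above, the adapter should render that template into roughly the following PromQL (the placeholder expands to the label matchers sent with the request):

sum(rq_jobs{status="queued", namespace="scanners", scm_instance!="", queue="semgrep", scm_instance="rq-exporter-bitbucket"}) by (scm_instance, queue)

That returns only the semgrep series (value 0), so the HPA stays at minReplicas instead of scaling on the clonerepo total.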