Scale-down is not happening for KEDA on Azure AKS

We are using KEDA to autoscale our Azure DevOps agents in an AKS cluster. We chose a ScaledJob object because a ScaledObject deployment behaved unexpectedly: agents were scaled down even while pipelines were still running on them.

The ScaledJob below resolved that behavior; however, we are now facing the concerns listed after it.

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: azdevops-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: azdevops-agent-job
          image: vstsimage
          imagePullPolicy: Always
          env:
          - name: AZP_URL
            value: [MYAZPURL]
          - name: AZP_TOKEN
            value: [MYAZPTOKEN]
          - name: AZP_POOL
            value: [MYAZPPOOL]
          volumeMounts:
          - mountPath: /mnt
            name: storage
        volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: azure-pvc
  pollingInterval: 30
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  maxReplicaCount: 10
  scalingStrategy:
    strategy: "default"
  triggers:
  - type: azure-pipelines
    metadata:
      poolID: "xxx"
      organizationURLFromEnv: "AZP_URL"
      personalAccessTokenFromEnv: "AZP_TOKEN"
  • We are using an Azure DevOps pool that contains VM-based agents alongside these Docker agents. We noticed that scale-up creates multiple replicas even though there are few pipelines in the queue. How can we control this?

  • Scale-down of the created jobs does not happen, even when no pipelines are executing.

  • Deleting scaled jobs from the cluster does not remove the agent entry from the Azure DevOps agent pool.

There is 1 answer

Answer by Shayki Abramczyk

Regarding issues 2 & 3:

You can specify successfulJobsHistoryLimit and failedJobsHistoryLimit, which control how many finished jobs are kept. In your case both are 5, so up to 5 pods of each kind will always remain. To prevent them from picking up new pipelines, configure your start.sh script so that the agent runs only one pipeline and then exits, using the --once flag:

exec ./externals/node/bin/node ./bin/AgentService.js interactive --once & wait $!

After that, to remove the agent entry from the pool, add the following at the end of the start.sh script:

./config.sh remove --unattended \
  --auth PAT \
  --token "$(cat "$AZP_TOKEN_FILE")"
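
Putting the two pieces together, the tail of start.sh might look like the sketch below. This is only an illustration, not the official script: AGENT_DIR, run_agent_once and remove_agent are hypothetical names introduced here, and AZP_TOKEN_FILE is assumed to have been set earlier in start.sh, as in the snippet above.

```shell
#!/bin/bash
# Sketch only: AGENT_DIR, run_agent_once and remove_agent are hypothetical
# names. AZP_TOKEN_FILE is assumed to be set earlier in start.sh.

# Directory where the agent was unpacked (assumed to be the current one).
AGENT_DIR="${AGENT_DIR:-.}"

# Run the agent for a single pipeline job; --once makes it exit afterwards,
# which lets the Kubernetes Job complete.
run_agent_once() {
  "$AGENT_DIR/externals/node/bin/node" "$AGENT_DIR/bin/AgentService.js" interactive --once &
  wait $!
}

# Deregister this agent from the Azure DevOps pool.
remove_agent() {
  "$AGENT_DIR/config.sh" remove --unattended \
    --auth PAT \
    --token "$(cat "$AZP_TOKEN_FILE")"
}
```

At the very end of start.sh you would then call `trap remove_agent EXIT` followed by `run_agent_once`, so that the removal runs when the script exits, including after a failed agent run.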