ArgoCD Cluster Registration Authentication Failure: argocd-k8s-auth failed with exit code 20


I am currently setting up ArgoCD in a hub model to manage deployments across multiple AKS clusters. I've chosen to use Managed Identities for authentication as outlined here: https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#aks

I have configured the necessary roles, permissions, and annotations, and I have verified manually that the argocd-k8s-auth command retrieves a token inside the ArgoCD pod. Even so, the deployment operation fails with this error: getting credentials: exec: executable argocd-k8s-auth failed with exit code 20.

The relevant portion of the error log is as follows:

ComparisonError: Failed to load live state: failed to get cluster info for "https://xxxx-dn4ng0go.hcp.westeurope.azmk8s.io:443": error synchronizing cache state: Get "https://xxxx-dn4ng0go.hcp.northeurope.azmk8s.io:443/version?timeout=32s": getting credentials: exec: executable argocd-k8s-auth failed with exit code 20

Steps Taken

  1. Assigned Azure Kubernetes Service RBAC Cluster Admin role to the ArgoCDManagedIdentity.
  2. Annotated the argocd service account with the Managed Identity's client ID and tenant ID.
  3. Confirmed environment variables for Azure Managed Identity are correctly injected into the ArgoCD pods.
  4. Manually tested argocd-k8s-auth within the pod, successfully retrieving a token.
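
For anyone reproducing step 4, here is a minimal sketch of how the returned token could be inspected to see which identity and audience it was issued for. It assumes the command prints an ExecCredential JSON document on stdout, as the Kubernetes exec credential plugin protocol requires. The function names are hypothetical, and the claim names listed are ones Azure AD access tokens commonly carry; which of them are present depends on the token version. The payload is only decoded, not signature-verified.

```python
import base64
import json


def decode_token_claims(exec_credential_json: str) -> dict:
    """Return the unverified JWT payload claims from an ExecCredential document."""
    credential = json.loads(exec_credential_json)
    token = credential["status"]["token"]
    # A JWT is three base64url segments: header.payload.signature
    payload_segment = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def summarize_identity_claims(exec_credential_json: str) -> dict:
    """Pick out the claims most useful for checking who the token belongs to."""
    claims = decode_token_claims(exec_credential_json)
    # Assumed claim names; any that are absent are reported as None.
    wanted = ("aud", "iss", "tid", "appid", "azp", "oid", "exp")
    return {name: claims.get(name) for name in wanted}
```

Passing the stdout captured from the manual argocd-k8s-auth run to summarize_identity_claims shows whether the token belongs to the expected managed identity and tenant. Running the manual test in the same pod and under the same service account as the component that reports the error makes the comparison more meaningful.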

If I instead use a service principal (SPN), as described in the same ArgoCD documentation, both cluster registration and deployment succeed. Below are the two cluster secrets: the first uses the SPN and works, the second uses workload identity and fails.

apiVersion: v1
kind: Secret
metadata:
  name: success-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: success-cluster
  server: <URL>
  config: |
    {
      "execProviderConfig": {
        "command": "argocd-k8s-auth",
        "env": {
          "AAD_ENVIRONMENT_NAME": "AzurePublicCloud",
          "AAD_SERVICE_PRINCIPAL_CLIENT_SECRET": "<SECRET>",
          "AZURE_TENANT_ID": "<TENANT_ID>",
          "AAD_SERVICE_PRINCIPAL_CLIENT_ID": "<CLIENT_ID>",
          "AAD_LOGIN_METHOD": "spn"
        },
        "args": ["azure"],
        "apiVersion": "client.authentication.k8s.io/v1beta1"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<CA_DATA>"
      }
    }
---
apiVersion: v1
kind: Secret
metadata:
  name: fail-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: fail-cluster
  server: <URL>
  config: |
    {
      "execProviderConfig": {
        "command": "argocd-k8s-auth",
        "env": {
          "AAD_ENVIRONMENT_NAME": "AzurePublicCloud",
          "AZURE_CLIENT_ID": "<MID_CLIENT_ID>",
          "AZURE_TENANT_ID": "<TENANT_ID>",
          "AZURE_FEDERATED_TOKEN_FILE": "/var/run/secrets/azure/tokens/azure-identity-token",
          "AZURE_AUTHORITY_HOST": "https://login.microsoftonline.com/",
          "AAD_LOGIN_METHOD": "workloadidentity"
        },
        "args": ["azure"],
        "apiVersion": "client.authentication.k8s.io/v1beta1"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<CA_DATA>"
      }
    }

Both the SPN and the managed identity are assigned the same permissions on the target cluster. As an extra measure, I elevated the managed identity's permissions to 'Owner' and validated that the federation between the argocd-server service account and the identity was created successfully.

What am I missing?
