Can't authenticate deploy of Databricks bundle in Azure pipeline using service principal


Issue

Trying to deploy a Databricks bundle within an Azure pipeline.

  • Databricks CLI = v0.209.0
  • Bundle artifact is downloaded to the VM correctly, following these instructions: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-azure-devops
  • Service principal is being used for authentication. Its access is verified by its ability to provision a Databricks workspace in Terraform, and its credentials are sufficient to generate a Microsoft Entra ID token.
  • The task I am trying to authenticate is databricks bundle deploy (sketched below)
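
For context, the deploy step itself looks roughly like the following (a minimal sketch; the working-directory path is illustrative, not the actual pipeline value):

    - script: |
        # Run from the directory containing the bundle's databricks.yml
        cd $(Pipeline.Workspace)/bundle
        databricks bundle deploy
      displayName: 'Deploy Databricks bundle'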

Approaches

First, I attempted to set up a .databrickscfg file using the default syntax, with an Entra ID token as the token:

- script: |
    echo "[DEFAULT]
    host = $url
    token = $token" > ~/.databricks.cfg

But received this error when trying to deploy the bundle:

Error: default auth: cannot configure default credentials. Config: host=https://<databricks-host>. Env: DATABRICKS_HOST
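
One way to see what the CLI actually picks up at this point is a quick debug step (file paths as in the snippet above):

    - script: |
        # Confirm which config file actually exists in the home directory
        ls -la ~ | grep databricks
        # List the profiles the CLI recognizes and whether they validate
        databricks auth profiles
      displayName: 'Debug Databricks CLI auth'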

I also attempted the above with my service principal's client secret as the token instead of an Entra ID token, which produced this error:

Error: Get "https://<databricks-host>/api/2.0/preview/scim/v2/Me": dial tcp: lookup <databricks-host>: no such host

This almost seemed like a promising improvement, but it was odd that the CLI could not resolve the very host I had just used to generate an Entra ID token.
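
A minimal reachability check from the agent (same variable as in the first snippet) would be something like:

    - script: |
        # Verify the workspace host resolves and responds from the build agent
        curl -sS -o /dev/null -w "HTTP %{http_code}\n" "$url"
      displayName: 'Check workspace host reachability'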

Lastly, I attempted to format the .databrickscfg file for service principals, as suggested here: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/cli/authentication#--azure-service-principal-authentication. For this I attempted:

      - script: |
          echo "[SP]
          host = $DATABRICKS_TARGET_HOST
          azure_tenant_id = $SP_TENANT_ID
          azure_client_id = $SP_CLIENT_ID
          azure_client_secret = $SP_CLIENT_SECRET" > /home/vsts/.databrickscfg

Then I attempted to force the CLI to use this profile with databricks bundle deploy -p SP, but received:

panic: config host mismatch: profile uses host ***, but CLI configured to use <databricks-host>
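
From the error text, the panic appears to come from the host in the bundle's databricks.yml target disagreeing with the host in the profile. For reference, a minimal target definition looks like this (bundle name and workspace URL are placeholders):

bundle:
  name: my-bundle

targets:
  dev:
    default: true
    workspace:
      # Must match the host in the .databrickscfg profile exactly
      host: https://adb-1234567890123456.7.azuredatabricks.net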

In all cases, when I run databricks auth profiles, all of my profiles are reported as Valid=No. Trying to deploy without first configuring authentication also fails.

I am aware of the posted solution: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-azure-devops.

  • This requires installing the Environment Variables Azure DevOps extension, but my organization will not approve any extensions of this kind.

I am also aware of the solution involving GitHub Actions (https://learn.microsoft.com/en-us/azure/databricks/dev-tools/ci-cd/ci-cd-github).

  • I don't want to pursue this option because I do not believe my organization will approve it, and I must do the bulk of my work in Azure Pipelines.

What is the correct way (if one exists) to authenticate a Databricks bundle deployment within an Azure pipeline YAML?

1 Answer

Answered by hce:

@Lorikiki Hi,

I am not sure, but it might be the following.

The Databricks CLI prioritizes the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN over the contents of the .databrickscfg file. Note also that the config file must be named ~/.databrickscfg; your first snippet writes to ~/.databricks.cfg, which the CLI will not read. Try the following:

steps:
    - script: |
        echo "[DEFAULT]
        host = $(DATABRICKS_HOST)
        token = $(DATABRICKS_TOKEN)" > ~/.databrickscfg
      displayName: 'Configure .databrickscfg'
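
Alternatively, since the CLI reads those environment variables directly, you could skip the config file entirely and map them onto the deploy step itself (a sketch, assuming DATABRICKS_HOST and DATABRICKS_TOKEN are defined as pipeline variables):

    - script: |
        databricks bundle deploy
      displayName: 'Deploy bundle'
      env:
        # Secret pipeline variables are not exposed to scripts automatically;
        # they must be mapped explicitly into the step's environment
        DATABRICKS_HOST: $(DATABRICKS_HOST)
        DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)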
    

PS:

  • DATABRICKS_HOST represents the per-workspace URL of your Azure Databricks workspace, beginning with https://, for example https://adb-<workspace-id>.<random-number>.azuredatabricks.net. Do not include the trailing / after .net.

  • DATABRICKS_TOKEN represents your Azure Databricks personal access token or Microsoft Entra ID token.
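
Since you are authenticating as a service principal rather than with a personal access token, the CLI's unified authentication can also pick up Azure service principal credentials from environment variables. A sketch using the variable names from your SP snippet (assuming they are defined as pipeline variables):

    - script: |
        databricks bundle deploy
      displayName: 'Deploy bundle as service principal'
      env:
        DATABRICKS_HOST: $(DATABRICKS_TARGET_HOST)
        # Azure service principal credentials recognized by Databricks unified auth
        ARM_TENANT_ID: $(SP_TENANT_ID)
        ARM_CLIENT_ID: $(SP_CLIENT_ID)
        ARM_CLIENT_SECRET: $(SP_CLIENT_SECRET)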