Unable to read data from ADLS Gen2 in Azure Databricks

I followed this Microsoft documentation to connect to my ADLS Gen2 storage account: https://learn.microsoft.com/en-gb/azure/databricks/connect/storage/tutorial-azure-storage

and used this to authenticate according to step 6:

service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>")

spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", "<application-id>")
spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net", service_credential)
spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net", "https://login.microsoftonline.com/<directory-id>/oauth2/token")
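Since every one of these keys embeds the storage account name, a typo in any single line is easy to miss. Below is a minimal sketch that builds all five settings from one place; the function name `oauth_conf` and its parameters are hypothetical, and only the key names and values come from the snippet above.

```python
def oauth_conf(storage_account: str, application_id: str,
               directory_id: str, client_secret: str) -> dict:
    """Return the five OAuth settings for one storage account as a dict.

    Hypothetical helper: the key names mirror the spark.conf.set calls above,
    so the account suffix is spelled identically in every key.
    """
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": application_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }

# In a Databricks notebook (where spark and dbutils exist) this could be applied as:
# for key, value in oauth_conf("<storage-account>", "<application-id>",
#                              "<directory-id>", service_credential).items():
#     spark.conf.set(key, value)
```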

Now when I run this:

df = spark.read.csv("abfss://<filepath>")

I get this error: abfss://filepath has invalid authority.
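For reference, the ABFS driver expects the authority part of the URI (the text between `abfss://` and the first `/`) to have the form `<container>@<storage-account>.dfs.core.windows.net`. The actual path is elided in this post, so whether that is the cause here is only a guess. The sketch below uses two hypothetical helpers, `build_abfss_uri` and `has_expected_authority`, to construct such a URI and to check one before passing it to `spark.read`.

```python
from urllib.parse import urlparse

ABFS_HOST_SUFFIX = ".dfs.core.windows.net"

def build_abfss_uri(container: str, storage_account: str, path: str) -> str:
    """Build abfss://<container>@<storage-account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{storage_account}{ABFS_HOST_SUFFIX}/{path.lstrip('/')}"

def has_expected_authority(uri: str) -> bool:
    """Check that the authority looks like <container>@<account>.dfs.core.windows.net.

    This is only a shape check (hypothetical helper); it does not contact Azure.
    """
    parsed = urlparse(uri)
    container, sep, host = parsed.netloc.partition("@")
    return (
        parsed.scheme == "abfss"
        and bool(container)
        and sep == "@"
        and host.endswith(ABFS_HOST_SUFFIX)
        and host != ABFS_HOST_SUFFIX  # the account name must not be empty
    )

# Example with made-up names:
uri = build_abfss_uri("mycontainer", "myaccount", "raw/data.csv")
# df = spark.read.csv(uri)  # inside a Databricks notebook
```

A URI such as `abfss://raw/data.csv`, with no `@<storage-account>.dfs.core.windows.net` part, would fail this check.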

I have double-checked:

  1. The tenant ID of the SP
  2. The client ID of the SP
  3. The secret scope name, created according to the above-mentioned documentation
  4. The role of the service principal on the container is "Storage Blob Data Contributor"

File Service properties of my storage account:

  Large file share: Disabled
  Identity-based access: Not configured
  Default share-level permissions: Disabled
  Soft delete: Enabled (7 days)
  Share capacity: 5 TiB

1 answer

Tahmeed

The secret scope for the SP didn't work, even though the SP had the "Storage Blob Data Contributor" role. So I created a secret scope for my storage account's access key instead, and it worked without any issues. I'm not sure exactly what the issue with the SP was. I used this:

spark.conf.set("fs.azure.account.key.<storage-account>.blob.core.windows.net", dbutils.secrets.get("scope-name", "secret-name"))

df = spark.read.csv("wasbs://container-name@sa_name.blob.core.windows.net/filepath")
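Note that access keys belong to the storage account, so the `fs.azure.account.key...` setting is keyed by the storage account name, while the container only appears in the `wasbs://` URI. A minimal sketch that keeps the two consistent follows; the helper names `account_key_conf_name` and `build_wasbs_uri` are hypothetical.

```python
def account_key_conf_name(storage_account: str) -> str:
    """Name of the Spark setting that holds the account access key (Blob endpoint)."""
    return f"fs.azure.account.key.{storage_account}.blob.core.windows.net"

def build_wasbs_uri(container: str, storage_account: str, path: str) -> str:
    """Build wasbs://<container>@<storage-account>.blob.core.windows.net/<path>."""
    return f"wasbs://{container}@{storage_account}.blob.core.windows.net/{path.lstrip('/')}"

# In a Databricks notebook, with made-up names:
# spark.conf.set(account_key_conf_name("myaccount"),
#                dbutils.secrets.get("scope-name", "secret-name"))
# df = spark.read.csv(build_wasbs_uri("mycontainer", "myaccount", "raw/data.csv"))
```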