Databricks Invalid token when using proxy/no_proxy


I have an Azure Machine Learning Compute Instance from which I run PySpark code against a Databricks cluster.

My company requires a proxy configuration, but I don't need the proxy to reach Databricks (and other resources), since they are in the same subscription/resource group/spoke.

When I don't set https_proxy, my query runs. But when I set https_proxy together with this no_proxy, it fails.

export no_proxy=localhost,127.0.0.1,.if.atcsg.net,.adf.azure.com,.afs.azure.net,.agentsvc.azure-automation.net,.analysis.windows.net,.api.azureml.ms,.applicationinsights.azure.com,.azconfig.io,.azmk8s.io,.aznbcontent.net,.azure-api.net,.azure-automation.net,.azurecr.io,.azuredatabricks.net,.azure-devices.net,.azure-devices-provisioning.net,.azurehdinsight.net,.azurestaticapps.net,.azuresynapse.net,.azurewebsites.net,.backup.windowsazure.com,.batch.azure.com,.blob.core.windows.net,.cassandra.cosmos.azure.com,.cognitiveservices.azure.com,.database.windows.net,.datafactory.azure.net,.dev.azuresynapse.net,.developer.azure-api.net,.dfs.core.windows.net,.dicom.azurehealthcareapis.com,.digitaltwins.azure.net,.directline.botframework.com,.documents.azure.com,.europe.directline.botframework.com,.europe.token.botframework.com,.eventgrid.azure.net,.fhir.azurehealthcareapis.com,.file.core.windows.net,.gremlin.cosmos.azure.com,.guestconfiguration.azure.com,.his.arc.azure.com,.inference.ml.azure.com,.instances.azureml.ms,.kubernetesconfiguration.azure.com,.kusto.windows.net,.managedhsm.azure.net,.mariadb.database.azure.com,.media.azure.net,.mongo.cosmos.azure.com,.monitor.azure.com,.mysql.database.azure.com,.notebooks.azure.net,.ods.opinsights.azure.com,.oms.opinsights.azure.com,.openai.azure.com,.pbidedicated.windows.net,.postgres.database.azure.com,.prod.migration.windowsazure.com,.purview.azure.com,.queue.core.windows.net,.redis.cache.windows.net,.redisenterprise.cache.azure.net,.scm.azurewebsites.net,.search.windows.net,.service.batch.azure.com,.service.signalr.net,.servicebus.windows.net,.siterecovery.windowsazure.com,.sql.azuresynapse.net,.table.core.windows.net,.table.cosmos.azure.com,.tip1.powerquery.microsoft.com,.token.botframework.com,.vault.azure.net,.vaultcore.azure.net,.web.core.windows.net,.workspace.azurehealthcareapis.com,azure-automation.net,azurecr.io,database.windows.net,sql.azuresynapse.net

I have .azuredatabricks.net in my no_proxy, which works for SQL warehouse queries, but not for PySpark on Databricks clusters.

Any idea which domain I need to add to no_proxy so that databricks-connect can communicate with my cluster?
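
For reference, here is a small diagnostic sketch for checking whether a given hostname would bypass the proxy under the matching rules of Python's standard library. The helper name bypasses_proxy and the workspace hostname are hypothetical, made up for illustration. This only shows how urllib interprets no_proxy; the transport that databricks-connect uses may apply different rules, so a True here does not prove the cluster connection skips the proxy.

```python
import os
from urllib.request import proxy_bypass_environment


def bypasses_proxy(host, no_proxy=None):
    """Return True if urllib's no_proxy rules would skip the proxy for host.

    If no_proxy is not passed, the value is read from the environment.
    """
    if no_proxy is None:
        no_proxy = os.environ.get("no_proxy", os.environ.get("NO_PROXY", ""))
    # urllib stores the no_proxy value under the key "no"
    return bool(proxy_bypass_environment(host, proxies={"no": no_proxy}))


if __name__ == "__main__":
    # Hypothetical workspace hostname; replace with the real one
    host = "adb-1234567890123456.7.azuredatabricks.net"
    print(host, "bypasses proxy:", bypasses_proxy(host))
```

If this reports True for the workspace host while the PySpark connection still fails, the no_proxy pattern itself is probably fine from urllib's point of view, and the difference more likely lies in how the cluster connection reads the proxy variables.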

Update

I found a workaround, but it feels dirty. Is it an acceptable approach? I would rather know the correct no_proxy domain than rely on this workaround.

import os
from pyspark.sql import SparkSession

# Back up the current value of https_proxy and remove it (None if it was not set)
https_proxy_backup = os.environ.pop("https_proxy", None)

try:
    # Create the session while https_proxy is unset
    spark = SparkSession.builder.getOrCreate()
finally:
    # Restore https_proxy only if it was set before
    if https_proxy_backup is not None:
        os.environ["https_proxy"] = https_proxy_backup
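
The same workaround can be wrapped in a context manager so the variable is always restored, and only if it existed beforehand. This is a minimal sketch; without_env is a hypothetical helper name, and the session creation is left as a comment so the snippet runs without PySpark installed.

```python
import os
from contextlib import contextmanager


@contextmanager
def without_env(*names):
    """Temporarily remove the given environment variables, then restore them."""
    # Keep only the variables that are actually set, removing them as we go
    saved = {name: os.environ.pop(name) for name in names if name in os.environ}
    try:
        yield
    finally:
        # Put back exactly what was removed; unset variables stay unset
        os.environ.update(saved)


if __name__ == "__main__":
    with without_env("https_proxy", "HTTPS_PROXY"):
        # spark = SparkSession.builder.getOrCreate()
        print("https_proxy set inside block:", "https_proxy" in os.environ)
```

This keeps the behavior of the snippet above but avoids writing an empty https_proxy back into the environment when none was set originally.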

