pyspark write to external hive cluster from databricks running on azure cloud


I have PySpark notebooks running in Databricks. I connect to an external Hive cluster using 'hive.Connection' from PyHive, and my data is in Spark DataFrames. My question is: how do I write the data from these DataFrames into a new Hive table that lives on a cluster other than Databricks?
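Roughly, my current setup looks like this (the host, credentials, and path below are placeholders, not my real values):

```python
from pyhive import hive

# connection to the external Hive cluster (placeholder host/credentials)
conn = hive.Connection(host="external-hive-host", port=10000, username="me")
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
print(cursor.fetchall())

# the data I want to write out is in a Spark DataFrame, e.g.
df = spark.read.parquet("/mnt/mydata/events")
```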

Thanks


1 Answer

Answered by CHEEKATLAPRADEEP

Every Databricks deployment has a central Hive metastore accessible by all clusters to persist table metadata. Instead of using the Databricks Hive metastore, you have the option to use an existing external Hive metastore instance.
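As a rough illustration (not the exact recipe), pointing a cluster at an existing metastore usually comes down to a handful of Spark config entries on the cluster. The host, port, and version below are placeholders; the exact keys and supported versions are covered in the article mentioned below:

```
spark.hadoop.hive.metastore.uris thrift://<metastore-host>:9083
spark.sql.hive.metastore.version 2.3.7
spark.sql.hive.metastore.jars builtin
```

If the metastore has to be reached through its backing database rather than a Thrift service, the javax.jdo.option.Connection* properties (URL, driver, user name, password) are set with the spark.hadoop. prefix instead.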

The Azure Databricks article on external Apache Hive metastores describes how to set up Azure Databricks clusters to connect to existing external Hive metastores. It covers the recommended metastore setup and the cluster configuration requirements, followed by instructions for configuring clusters to connect to an external metastore.
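Once the cluster is attached to the external metastore, a plain saveAsTable from the notebook registers the table in that metastore. A minimal sketch, assuming a hypothetical database/table name and storage path:

```python
# Because the cluster is configured against the external Hive metastore,
# this table is registered there instead of in the Databricks metastore.
# "mydb.mytable" and the storage path are placeholders.
(df.write
   .mode("overwrite")
   .format("parquet")
   .option("path", "abfss://data@mystorageaccount.dfs.core.windows.net/tables/mytable")
   .saveAsTable("mydb.mytable"))
```

Keep in mind that the metastore only holds the table metadata; the actual files land in the location the table points to (DBFS or the explicit path above), so the external Hive cluster must be able to read that storage in order to query the table.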

You may check out this article about Securing Access To Shared Metastore With Azure Databricks.