PySpark Connect can't show all Hive databases


I'm using the Spark Connect feature of PySpark 3.4.0 to connect to a remote Hive 3.1.3.

When I create a SparkSession in local mode with Hive support enabled, all databases in Hive can be viewed:

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().master("local").getOrCreate()
spark.sql("show databases").show()

But when I try to use Spark Connect, only the default database is shown:

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().remote("sc://localhost:15002").getOrCreate()
spark.sql("show databases").show()

I expected it to show all the databases, so that I can select from and add data to them.

I have copied 'hive-site.xml' to $SPARK_HOME/conf.
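
The relevant part of that hive-site.xml is just a minimal metastore pointer along these lines (host and port are placeholders):

<configuration>
  <property>
    <!-- Thrift URI of the remote Hive metastore -->
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>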


1 Answer

Vijay_Shinde

There could be several reasons for Spark Connect not displaying all Hive databases in PySpark. Here are a few possible issues that you might need to check:

Permissions: The user account being used to connect to Hive may not have sufficient permissions to access all the databases. Ensure that the user has the necessary privileges to view all the databases.
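
As a quick check, you can list databases directly against HiveServer2 as that same user with beeline (the JDBC URL and user name below are placeholders):

beeline -u "jdbc:hive2://hive-host:10000" -n your_user -e "SHOW DATABASES;"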

Configuration: Verify the Hive connection settings on the Spark Connect server side, not just on the client. Enabling Hive support from the client does not carry over to the server, because spark.sql.catalogImplementation is a static setting of the long-running server application; the Spark Connect server itself must be started with Hive support and must be able to read hive-site.xml (with the correct hive.metastore.uris) from its own $SPARK_HOME/conf.
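
For example, a sketch of starting the Spark Connect server with Hive support (this assumes Spark 3.4.0 built with Scala 2.12, run on the server machine, with hive-site.xml already in that machine's $SPARK_HOME/conf):

./sbin/start-connect-server.sh \
  --packages org.apache.spark:spark-connect_2.12:3.4.0 \
  --conf spark.sql.catalogImplementation=hive
# Alternatively, point at the metastore explicitly instead of relying on hive-site.xml:
#   --conf spark.hadoop.hive.metastore.uris=thrift://metastore-host:9083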

Hive metastore synchronization: It's possible that the Hive metastore is not properly synchronized with the databases. Try refreshing or updating the metastore to ensure it reflects the latest database changes.
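
Once the server side is fixed, a quick way to verify from the client is to list databases through the catalog API, which PySpark 3.4 supports over Spark Connect (connection URL as in the question):

from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()

# With the server backed by the Hive metastore, this should list every
# database in the metastore, not just `default`.
for db in spark.catalog.listDatabases():
    print(db.name)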