Create database spark sql

I'm using Spark 2.4.4 with the AWS Glue catalog.

In my Spark job, I need to create a database in Glue if it doesn't exist. I'm using the following Spark SQL statement to do so:

spark.sql("CREATE DATABASE IF NOT EXISTS %s".format(hiveDatabase));

It works as expected in spark-shell: the database gets created in Glue. But when I run the same code with spark-submit, the database is not created. Is there a commit/flush that I need to do when using spark-submit?

EDIT: I'm getting different results for SHOW DATABASES in spark-shell (first table) and spark-submit (second table):

+---------------------+
|databaseName         |
+---------------------+
|all                  |
|default              |
|hive-db              |
|navi-database-account|
|navi-par             |
|testdb               |
+---------------------+


+------------+
|databaseName|
+------------+
|default     |
+------------+

It looks like spark-submit is creating the database somewhere, but not in Glue.

Best answer, by nish:

I needed to add the following config when building the Spark session:

("spark.sql.catalogImplementation", "hive")
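
For context, a minimal sketch of where that setting might go in a job launched with spark-submit. It assumes the cluster is already configured to use Glue as the Hive metastore; the object name, app name, and the fallback database name "testdb" are placeholders and not from the original job. spark.sql.catalogImplementation is a static setting, so it has to be supplied before the session is created. spark-shell typically enables Hive support on its own when the Hive classes are on the classpath, which would explain why the two environments saw different catalogs.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical entry point; names here are illustrative only.
object CreateGlueDatabase {
  def main(args: Array[String]): Unit = {
    // Assumption: the database name arrives as the first argument.
    val hiveDatabase = args.headOption.getOrElse("testdb")

    val spark = SparkSession.builder()
      .appName("create-glue-database")
      // Use the Hive (here: Glue-backed) catalog instead of the in-memory one.
      // Calling .enableHiveSupport() on the builder sets the same option.
      .config("spark.sql.catalogImplementation", "hive")
      .getOrCreate()

    // Backticks keep names containing hyphens (e.g. hive-db) valid.
    spark.sql(s"CREATE DATABASE IF NOT EXISTS `$hiveDatabase`")

    // Should now list the Glue databases rather than only "default".
    spark.sql("SHOW DATABASES").show(false)

    spark.stop()
  }
}
```

The same option can also be passed on the command line, as in spark-submit --conf spark.sql.catalogImplementation=hive, which avoids hard-coding it in the job.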