I am trying to write DataFrame data into a table in Azure SQL from Databricks using PySpark. The table dbo.test already exists in the database, and I am able to read from it before executing the write operation below.
testDf.write.format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("truncate", "false") \
    .option("url", azure_sql_url) \
    .option("dbtable", "dbo.test") \
    .option("databaseName", database_name) \
    .option("user", username) \
    .option("password", password) \
    .option("encrypt", "true") \
    .option("hostNameInCertificate", "*.database.windows.net") \
    .option("bulkCopyBatchSize", 10000) \
    .option("bulkCopyTableLock", "true") \
    .option("bulkCopyTimeout", "6000000") \
    .save()
After executing this command, the following error is returned:
java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.schemaString(Lorg/apache/spark/sql/Dataset;Ljava/lang/String;Lscala/Option;)Ljava/lang/String;
Surprisingly, the dbo.test table gets dropped.
Can someone help me understand why this is happening? The same code works fine in another environment.
The cause of the error is a version mismatch between Spark SQL and the Spark connector for SQL Server. Also, mode("overwrite") by default drops the table if it already exists and re-creates a new one, which is why dbo.test disappears; setting the truncate option to "true" makes Spark truncate the existing table instead (a sketch of this follows the version table below). Here are the supported connector versions for each Spark version (from the Microsoft docs referenced below):

Connector                                              Supported Spark version
com.microsoft.azure:spark-mssql-connector:1.0.2        Spark 2.4.x (Scala 2.11)
com.microsoft.azure:spark-mssql-connector_2.12:1.1.0   Spark 3.0.x (Scala 2.12)
com.microsoft.azure:spark-mssql-connector_2.12:1.2.0   Spark 3.1.x (Scala 2.12)
Resolution:
If you are using a Databricks runtime version greater than 10, you need to downgrade to Databricks runtime 9.1 LTS or lower and use the matching connector.
I downgraded the Databricks runtime to 7.3 LTS (Spark 3.0.1), then installed the appropriate library for Spark 3.0.x, which is com.microsoft.azure:spark-mssql-connector_2.12:1.1.0, and tried your code; it works fine.
My execution:
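A sketch of that execution, with a small hypothetical DataFrame standing in for the original testDf (connection variables as in the question):

# Hypothetical sample data in place of the original testDf
testDf = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "value"])

testDf.write.format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", azure_sql_url) \
    .option("dbtable", "dbo.test") \
    .option("user", username) \
    .option("password", password) \
    .save()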
OUTPUT:
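To verify the result in text form, read the table back with the same connection options (a sketch):

spark.read.format("com.microsoft.sqlserver.jdbc.spark") \
    .option("url", azure_sql_url) \
    .option("dbtable", "dbo.test") \
    .option("user", username) \
    .option("password", password) \
    .load() \
    .show()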
Or you can directly use the generic JDBC source, dataframename.write.format("jdbc"). Sample code:
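A minimal sketch using the built-in JDBC source, assuming the Microsoft SQL Server JDBC driver is available on the cluster and the same connection variables as above:

testDf.write.format("jdbc") \
    .mode("overwrite") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .option("url", azure_sql_url) \
    .option("dbtable", "dbo.test") \
    .option("user", username) \
    .option("password", password) \
    .save()

Note that the same mode("overwrite")/truncate semantics apply here: without truncate set to "true", the table is dropped and re-created.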
You can also refer to a similar issue on GitHub or this similar SO thread.
Reference: https://learn.microsoft.com/en-us/sql/connect/spark/connector?view=sql-server-ver16