I am trying to write DataFrame data into a table in Exasol from AWS Databricks using PySpark, truncating the table before the overwrite. I am using the following command:
```python
(_df.write.format("jdbc")
    .option("url", "**********")
    .option("dbtable", dbtable)
    .option("user", username)
    .option("password", password)
    .option("truncate", "true")
    .option("driver", "com.exasol.jdbc.EXADriver")
    .mode("overwrite")  # other modes: "append", "ignore", "error"
    .save())
```
This results in the table on Exasol being dropped rather than truncated. A solution that sends a TRUNCATE command over the same JDBC connection before writing in append mode would also work.
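Something along these lines is what I have in mind as the truncate-then-append workaround; this is an untested sketch that assumes py4j access to the JVM is available on the cluster and that `jdbc_url` holds the connection string masked above:

```python
# Untested sketch: truncate via a raw JDBC connection, then append with Spark.
# Assumes jdbc_url is the Exasol connection string (masked above) and that the
# Exasol driver jar on the cluster classpath self-registers with DriverManager.
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url, username, password)
try:
    stmt = conn.createStatement()
    stmt.executeUpdate(f"TRUNCATE TABLE {dbtable}")  # plain SQL, outside Spark
    stmt.close()
finally:
    conn.close()

# Append instead of overwrite so Spark never drops or recreates the table.
(_df.write.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", dbtable)
    .option("user", username)
    .option("password", password)
    .option("driver", "com.exasol.jdbc.EXADriver")
    .mode("append")
    .save())
```

I have not been able to confirm whether this pattern is supported on my cluster configuration, so any guidance on it (or on making `truncate=true` behave as documented) would help.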
The cluster and Exasol driver details are:

- Databricks Runtime 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)
- Exasol JDBC driver version: 7.1.20 (also tried 7.1.16 with the same result)
The most relevant question I found targets Azure Databricks instead: Table gets deleted when trying to overwrite the data in it from databricks spark
A similar question was posted in the Databricks community, but I did not see a solution there either: https://community.databricks.com/t5/data-engineering/why-spark-save-modes-quot-overwrite-quot-always-drops-table/td-p/12749