Databricks change data feed

61 views Asked by At

According to documentation change data feed on Databricks support both batch queries and streaming queries.

I was wondering if there is anyway to pass the parameters inside the batch queries.

More specifically, lets say I want to read the change feed starting from version 3. Actuall query will look like this =>

SELECT * FROM table_changes('tableName', 3)

But I want to pass this 3 parameter using some kind of variable which will be based on the table, so I query table, retrive the max(version) and that value should be 3.

My pseudo code looks like this:

SELECT * FROM table_changes("catalog.bronze.bronze_table", (SELECT MAX(version) FROM catalog.silver.silver_table))

How can I pass this 3 parameters inside the query using only the Databricks SQL and no spark at all.

1

There are 1 answers

0
JayashankarGS On

You can refer to the stack solution below, which shows how to use IDENTIFIER for column names, as it supports various DDL statements like CREATE, ALTER, etc.

azure - Parametrization in Databricks SQL - Stack Overflow

To get the max(version), you need to refer to the describe history statement as mentioned in this solution and use it to filter table_changes() results.

For more information, you can refer to this documentation.

IDENTIFIER clause

This applies to Databricks SQL and Databricks Runtime 13.2 and above.