According to documentation change data feed on Databricks support both batch queries and streaming queries.
I was wondering if there is anyway to pass the parameters inside the batch queries.
More specifically, lets say I want to read the change feed starting from version 3. Actuall query will look like this =>
SELECT * FROM table_changes('tableName', 3)
But I want to pass this 3 parameter using some kind of variable which will be based on the table, so I query table, retrive the max(version) and that value should be 3.
My pseudo code looks like this:
SELECT * FROM table_changes("catalog.bronze.bronze_table", (SELECT MAX(version) FROM catalog.silver.silver_table))
How can I pass this 3 parameters inside the query using only the Databricks SQL and no spark at all.
You can refer to the stack solution below, which shows how to use
IDENTIFIERfor column names, as it supports various DDL statements like CREATE, ALTER, etc.azure - Parametrization in Databricks SQL - Stack Overflow
To get the
max(version), you need to refer to the describe history statement as mentioned in this solution and use it to filtertable_changes()results.For more information, you can refer to this documentation.
IDENTIFIER clause
This applies to Databricks SQL and Databricks Runtime 13.2 and above.