Situation
I want to write an Exasol UDF that calculates scoring models with different parameters. It should also be possible to specify a model formula, e.g. Treat ~ Sex + Age + Income
Unfortunately, the column names of the input table are not accessible within the UDF. Although the official EXASOL documentation says so:
exa$meta$input_columns[]
Array including the following information: {name, type, sql_type, precision, scale, length}
The columns are simply numbered consecutively, as shown in the following reproducible example:
Reprex
--/
CREATE OR REPLACE R SET SCRIPT TEST_SCHEMA.TEST_COLUMNNAMES(...) EMITS(COLNAMES CHAR(2000)) AS
run <- function(ctx) {
ctx$next_row(NA)
numCols <- exa$meta$input_column_count
inCols <- exa$meta$input_columns[1:numCols]
ctx$emit(inCols)
}
/
SELECT TEST_SCHEMA.TEST_COLUMNNAMES(CLUSTER_NAME, MEASURE_TIME, EVENT_TYPE, DBMS_VERSION, NODES, DB_RAM_SIZE, PARAMETERS) FROM EXA_STATISTICS.EXA_SYSTEM_EVENTS limit 10;
which results in:
COLNAMES |
---|
list(name = "0", type = "character", sql_type = "VARCHAR(128) UTF8", precision = NA, scale = NA, length = 128) |
list(name = "1", type = "POSIXt", sql_type = "TIMESTAMP", precision = NA, scale = NA, length = NA) |
list(name = "2", type = "character", sql_type = "VARCHAR(30) UTF8", precision = NA, scale = NA, length = 30) |
list(name = "3", type = "character", sql_type = "VARCHAR(30) UTF8", precision = NA, scale = NA, length = 30) |
list(name = "4", type = "integer", sql_type = "DECIMAL(4,0)", precision = 4, scale = 0, length = NA) |
list(name = "5", type = "numeric", sql_type = "DECIMAL(10,1)", precision = 10, scale = 1, length = NA) |
list(name = "6", type = "character", sql_type = "VARCHAR(100000) UTF8", precision = NA, scale = NA, length = 100000) |
Question
Is there a way to access the column names of the input table in a UDF?
The example is an R-UDF, but I assume that this is also true for the other UDF scripting languages (Python, Java).
A simple test
suggests that
exa.meta.input_columns[]
shows UDF parameters, not the names of the really passed arguments.To be sure one might want to check the code in https://github.com/exasol/script-languages-release/tree/master. However, I doubt that runtime names could be accessible from inside the UDF. In particular, it is possible to call a UDF with constants like
test.col_names('42')
...