KDB using function in where clause

4k views Asked by At

Is there a way in kdb to use functions in queries. assume I have a table with columns

`red`blue`green`yellow`white`purple

which contain either values 0 or 1. Instead of querying

select where ((red=1) or (green=1))

can I use a function like

isRG:{((select green from x)=1) or ((select red from x)=1)}

to filter a select? I can do

f:[select from t] 

and it returns a column with true and false, but I cant figure out how to do something along the lines of

select from t where f[select from t]

to get all entries where f(x) is true

5

There are 5 answers

0
MdSalih On BEST ANSWER

Yes - although you don't need to pass the entire table into the function, just the relevant columns:

/ define a table
tbl:flip`red`blue`green`yellow`white`purple!(0N;6)#36?0b;
    red blue green yellow white purple
    ----------------------------------
    1   0    0     0      1     1     
    1   0    0     0      0     1     
    1   0    0     0      0     0     
    0   0    1     0      0     0     
    1   1    0     0      0     0     
    0   0    0     0      1     0     

/ define a function to work on 2 columns - this func simply does or between the columns 
f:{[column1;column2] column1|column2};

/ use function on 2 columns of table
select from tbl where f[red;green]
    red blue green yellow white purple
    ----------------------------------
    1   0    0     0      1     1     
    1   0    0     0      0     1     
    1   0    0     0      0     0     
    0   0    1     0      0     0     
    1   1    0     0      0     0     

The key principles to keep in mind when using a function in a select where clause is:

  • Each column argument is passed in as a vector
  • The value returned by the function must be a vector of booleans (or boolean-equivalent type eg. integers, as 0 is considered false) of the same length
0
James Little On

It seems easier to work with a dictionary in this scenario; using flip on t below:

q)t
red blue green yellow white purple
----------------------------------
0   1    0     1      1     0
q)
q)(key[x:flip[t]] where (raze value x=1))#x
blue  | 1
yellow| 1
white | 1

enlist it if you want the result as a table:

q)enlist (key[x:flip[t]] where (raze value x=1))#x
blue yellow white
-----------------
1    1      1

Another possibility might be to use a functional select and filter the list of column names to those where all values are equal to 1.

0
notlightnorchroma On

Some variant on this might be interesting.

foo:{[t;c] t where (|) over flip c#t}

Note that c (column names) needs to be a list. Using tbl from a previous response:

foo[tbl;`red`blue]
0
SJT On

First, the columns are boolean vectors, so you can use them ‘raw’ in the where clause:

q)tbl
red blue green yellow white purple
----------------------------------
0   0    1     0      0     0
1   0    1     1      0     0
1   0    1     1      1     0
0   0    0     0      0     0
0   1    0     0      1     1
1   0    1     1      0     0
q)select from tbl where red or green
red blue green yellow white purple
----------------------------------
0   0    1     0      0     0
1   0    1     1      0     0
1   0    1     1      1     0
1   0    1     1      0     0

Can you use your own function in the where clause? Absolutely.

q)isRG:{or/[x`red`green]}
q)isRG tbl
111001b
q)select from tbl where isRG tbl
red blue green yellow white purple
----------------------------------
0   0    1     0      0     0
1   0    1     1      0     0
1   0    1     1      1     0
1   0    1     1      0     0

Going beyond your question, to make the column names arguments to your function, instead of writing a function to use in the q-SQL where clause, use functional select. In this, you express your constraint as a parse tree, e.g. (or;`red;`white)

q)?[tbl; enlist(or; `red; `white); 0b; ()]
red blue green yellow white purple
----------------------------------
1   0    1     1      0     0
1   0    1     1      1     0
0   1    0     0      1     1
1   0    1     1      0     0

You can then parameterise the column names:

q)selEither:{[t; c1; c2] ?[t; enlist(or; c1 ;c2); 0b; ()]}
q)selEither[tbl; `red; `white]
red blue green yellow white purple
----------------------------------
1   0    1     1      0     0
1   0    1     1      1     0
0   1    0     0      1     1
1   0    1     1      0     0

Lastly, you can extend this from a pair to a list of column names:

q)selAny:{[tbl; cn] ?[tbl; enlist(or/;enlist,cn); 0b; ()]}
q)selAny[t; `white`green`yellow]
…

See more at KX Technical Whitepaper Parse Trees and Functional Forms

0
nyi On

A slightly more generalized solution where you can specify the values for each column separately (`red`blue`green;1 1 0) and when the values are not boolean.

q)t:`red`blue`green`yellow`white`purple!/:(1 0 1 0 1 0;1 1 1 0 0 0;0 0 0 0 1 1)

/Create a 'where' clause dynamically 
q)df:{{(or;x;y)}over{(=;x 0;x 1)} each flip x}

/test 
q)df (`red`blue`green;1 1 0)
(|;(|;(=;`red;1j);(=;`blue;1j));(=;`green;0j))

/functional select 
f2:{[tab;cl;df]?[tab;enlist (df[cl];::);0b;()!()]}

/Projection using the df with f2 function 
f:f2[;;df]

/Calling the function 
f[t;(`red`blue`green;1 1 0)]
red blue green yellow white purple
----------------------------------
1   0    1     0      1     0
1   1    1     0      0     0
0   0    0     0      1     1