Difference when you query from KDB HDB and KDB RDB

3.8k views Asked by At

I heard that there is a difference when you query (select, etc.) from HDB and RDB (in-memory) databases. Is it possible to describe all possible scenarios when we should use HDB specific queries and RDB specific queries and how to query: i.e. example of the query for HDB and the same example for RDB?

3

There are 3 answers

0
terrylynch On BEST ANSWER

Off the top of my head:

Partitioned HDB tables will have a "virtual" date column

RDB tables (generally) won't have a `date column

The virtual "i" column behaves differently for Partitioed HDB tables (http://code.kx.com/q/ref/dotq/#qind-partitioned-index)

HDB tables (unless stored flat/serialised) are not immediately pulled entirely into memory, data is read on demand

The symbol columns in HDB tables will be enumerated, in-memory RDB tables will be unenumerated

There's no real examples I can give but you just need to keep these in mind when querying

0
Manish Patel On

If you are using a vanilla rdb/hdb set up then this is the scenario:

  • tickeplant collects data for x millis and pumps to listeners
  • rdb is one such listener. It will hold data from midnight today to just before midnight tonight
  • at midnight tickerplant sends .u.end message
  • this invokes rdb to dump the in-memory table onto disk inside a 2014.12.19/ directory
  • note that the schema on an rdb is time, sym, then other columns. On hdb this switches to date (virtual), sym (p attr), time (sorted within sym)

So your where clause criteria is:

  • If you need to query data for today, it's rdb
  • anything before today, it's hdb
  • anything mixed, create functions on your hdb that pull data from rdb and join

The most optimal query for an rdb is always

select from table where time ...

because the rdb table is time sorted.

The most optimal query for an hdb is always

select from table where date=2014.12.24, sym=`AAPL, time ...

because it reduces seeks to disk (only needs to check data in 2014.12.24 directory), is sym parted and then time sorted within sym (altho technically there is no s attribute obviously). Having date as the first part of your where clause is very important! :)

2
Ramprasad On

So far I have encountered the following example queries which does not work in hdb.

  1. count tablename
  2. select [10] from tablename
  3. delete/update/insert statements have only temporary effect till hdb is restarted

i'll update this list as when i come across more