Clickhouse- Search array of strings using inverted indexes

415 views Asked by At

I have a table with column type Array(String)

CREATE TABLE db.logs
( `timestamp` DateTime CODEC(Delta(4),
 ZSTD(1)),
    `message` String,    
    `source_type` LowCardinality(String),
    `labels_key` Array(String),
    `labels_value` Array(String),
     INDEX lk_inv_idx labels_key TYPE inverted GRANULARITY 1,
     INDEX lv_inv_idx labels_value TYPE inverted GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY timestamp
PRIMARY KEY (timestamp)
ORDER BY
(timestamp)
SETTINGS index_granularity = 8192;

array columns:
['color','make','year','type', 'interior']  ['red','toyota','2020','crossover', 'black']
['color','make','year','type', 'interior']  ['white','dodge','2023','pickup', 'black']
['color','make','year','type', 'interior']  ['red','audi','2021','sedan', 'red']
['color','make','year','type', 'interior']  ['yellow','bmw','2020','hatchback', 'yellow']

I want to search array columns using inverted index but when I search them using has() functions, it doesn't use the index and I can't use hasToken() (This is what I am using to search String columns with inverted array). Any ideas on how I can search using inverted index on Array(String) column. Thanks in advance.

1

There are 1 answers

1
Slach On BEST ANSWER

According to https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree#functions-support

has supported by inverted index

but you need to understand how indexes works in ClickHouse this is not indexes for search, when you firstly search in index and secondary search in main table space

this is DATA SKIP indexes, you skip granules (how many depends on GRANULARITY) inside data parts (system.parts) during filtering, so if your data spread by all granules, then secondary indexes are useless.

Look https://fiddle.clickhouse.com/df6acefe-5511-41e8-bd32-c3d1815d16d1

In output Index lk_inv_idx has dropped 0/2 granules. means, secondary index was used, but all granules contains color, so the index was useless.