Distributed memory table in Clickhouse

35 views Asked by Mohan Parthasarathy At 04 March 2024 at 05:24

I want to be able to store select query results from a Distributed ReplicatedMergeTree on to a memory table temporarily to run some aggregations. I decided to go with a "Memory" engine where the results from each node would be cached locally. I need a "Distributed" memory table which was not clear whether it was supported or not by reading the documentation. I gave it a try.

I am able to create a memory table in the cluster:

CREATE TABLE IF NOT EXISTS table_local ON CLUSTER '{cluster}' { ...} ENGINE = Memory

I want to be able to use a Distributed table on top of this:

CREATE TABLE IF NOT EXISTS table_distributed ON CLUSTER '{cluster}' AS table_local ENGINE = Distributed('{cluster}', default, table_local, rand())

This command successfully completes. I had to use default as I am not sure what the right value here for Memory table would be. I am also able to insert data to "table_distributed".

INSERT INTO table_distributed select * from <some_other_table>

But the subsequent "select" query itself succeeds only from one of the nodes. If i send my query to the load balancer in the front, it fails. If I send this to one specific pod address, it succeeds. Can someone clarify whether this is supported ? This seems like a useful case if i want to run a bunch of aggregations on some filtered results from a table. I have tried temporary table also and does not work.

Original Q&A

TechQA.

Distributed memory table in Clickhouse

There are 0 answers

Related Questions in MEMORY

Related Questions in DISTRIBUTED

Related Questions in CLICKHOUSE

Popular Questions

Trending Questions