I want to be able to store select query results from a Distributed ReplicatedMergeTree on to a memory table temporarily to run some aggregations. I decided to go with a "Memory" engine where the results from each node would be cached locally. I need a "Distributed" memory table which was not clear whether it was supported or not by reading the documentation. I gave it a try.
I am able to create a memory table in the cluster:
CREATE TABLE IF NOT EXISTS table_local ON CLUSTER '{cluster}' { ...} ENGINE = Memory
I want to be able to use a Distributed table on top of this:
CREATE TABLE IF NOT EXISTS table_distributed ON CLUSTER '{cluster}' AS table_local ENGINE = Distributed('{cluster}', default, table_local, rand())
This command successfully completes. I had to use default as I am not sure what the right value here for Memory table would be. I am also able to insert data to "table_distributed".
INSERT INTO table_distributed select * from <some_other_table>
But the subsequent "select" query itself succeeds only from one of the nodes. If i send my query to the load balancer in the front, it fails. If I send this to one specific pod address, it succeeds. Can someone clarify whether this is supported ? This seems like a useful case if i want to run a bunch of aggregations on some filtered results from a table. I have tried temporary table also and does not work.