Druid : org.apache.druid.query.ResourceLimitExceededException

2.9k views Asked by At

First I ingest 2 mysql tables into kafka and then I loaded that kafka topics in druid.Now I have 2 datasources in druid in which one datasource includes upto 6000 records and another datasource includes upto 720000 records. I was trying to fetch records from both datasources using druid sql join query but it shows following error :

Resource limit exceeded / Subquery generated results beyond maximum[100000] /
org.apache.druid.query.ResourceLimitExceededException

Query :

SELECT * FROM unsold_all LEFT JOIN unsold_hotel_data ON unsold_all.booking_hotel_code=unsold_hotel_data.hotel_code

Any solution ?

1

There are 1 answers

3
Peter Marshall On

To override the default limit on subquery size, change the druid.server.http.maxSubqueryRows configuration entry.

https://druid.apache.org/docs/latest/configuration/index.html#server-configuration

However, doing so will impact the performance of your queries and is not recommended without caution.

Consider using filters appropriately to reduce the number of rows in the subquery. (1) add a time period WHERE clause instead of querying all data, and (2) add a GROUP BY to generate statistics rather than just doing SELECT *.

It wasn't clear in the OP so just checking that you are aware of the constraints on the left- and right-hand-side of the JOIN and that you had ingested the right data into the right data structures in Druid. https://druid.apache.org/docs/latest/querying/datasource.html#join