First I ingest 2 mysql tables into kafka and then I loaded that kafka topics in druid.Now I have 2 datasources in druid in which one datasource includes upto 6000 records and another datasource includes upto 720000 records. I was trying to fetch records from both datasources using druid sql join query but it shows following error :
Resource limit exceeded / Subquery generated results beyond maximum[100000] /
org.apache.druid.query.ResourceLimitExceededException
Query :
SELECT * FROM unsold_all LEFT JOIN unsold_hotel_data ON unsold_all.booking_hotel_code=unsold_hotel_data.hotel_code
Any solution ?
To override the default limit on subquery size, change the
druid.server.http.maxSubqueryRows
configuration entry.https://druid.apache.org/docs/latest/configuration/index.html#server-configuration
However, doing so will impact the performance of your queries and is not recommended without caution.
Consider using filters appropriately to reduce the number of rows in the subquery. (1) add a time period
WHERE
clause instead of querying all data, and (2) add a GROUP BY to generate statistics rather than just doingSELECT *
.It wasn't clear in the OP so just checking that you are aware of the constraints on the left- and right-hand-side of the
JOIN
and that you had ingested the right data into the right data structures in Druid. https://druid.apache.org/docs/latest/querying/datasource.html#join