I'm using the Elasticsearch query DSL and I'm trying to use the result of one query as a parameter for another query, like below:
{
  "query": {
    "bool": {
      "must_not": {
        "terms": {
          "request_id": {
            "query": {
              "match": {
                "processing.message": "OUT Followup Synthesis"
              }
            },
            "fields": [
              "request_id"
            ],
            "_source": false
          }
        }
      }
    }
  }
}
As you can see above, I'm trying to search for documents whose request_id is not one of the request_ids of the documents whose processing.message equals OUT Followup Synthesis.
I'm getting an error with this query:
Error loading data [x_content_parse_exception] [1:1660] [terms_lookup] unknown field [query]
How can I achieve my goal using Elasticsearch DSL?
Original question extracted from the comments
Answer: generally speaking, Elasticsearch supports neither SQL-style joins nor subqueries.
So you'll have to run your first query, take the retrieved IDs and put them into a second query, ideally a terms query.

That said, this limitation can be worked around by "hijacking" a scripted metric aggregation.
Taking these 3 documents as examples:
you could run
which'd return only the correct request:
⚠️ This is almost guaranteed to be slow and goes against the suggested guidance of not accessing the _source field. But it also goes to show that subqueries can be "emulated".

I'd recommend testing this script on a smaller set of documents before letting it target your whole index, for example by restricting it through a date range query or similar.

FYI, Elasticsearch also exposes an SQL API, though it ships as part of X-Pack, so check which license tier your version requires for it.
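
For the two-step approach recommended at the top of this answer (run the first query, collect the request_ids, then exclude them with a terms query), here is a minimal Python sketch. The helper names (build_id_query, extract_request_ids, build_exclusion_query, find_unmatched) and the size of 1000 are my own assumptions for illustration, not part of any Elasticsearch client. The search argument stands for whatever callable you use to send a request body to the _search endpoint and get the parsed JSON response back.

```python
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]


def build_id_query(message: str, size: int = 1000) -> Query:
    """First query: fetch only the request_id of documents matching the message.

    `size` is an assumed cap; if more documents can match, you would need
    to paginate (e.g. with search_after) instead of a single request.
    """
    return {
        "size": size,
        "_source": ["request_id"],
        "query": {"match": {"processing.message": message}},
    }


def extract_request_ids(response: Dict[str, Any]) -> List[str]:
    """Collect the distinct request_id values from a search response."""
    ids = {
        hit["_source"]["request_id"]
        for hit in response["hits"]["hits"]
        if "request_id" in hit.get("_source", {})
    }
    return sorted(ids)


def build_exclusion_query(request_ids: List[str]) -> Query:
    """Second query: every document whose request_id is NOT in the list."""
    return {
        "query": {
            "bool": {
                "must_not": {"terms": {"request_id": request_ids}}
            }
        }
    }


def find_unmatched(search: Callable[[Query], Dict[str, Any]], message: str) -> Dict[str, Any]:
    """Run both steps: look up the IDs, then search for everything else."""
    first_response = search(build_id_query(message))
    request_ids = extract_request_ids(first_response)
    return search(build_exclusion_query(request_ids))
```

Keep in mind that a terms query accepts a limited number of terms (65,536 by default, controlled by the index.max_terms_count setting), so this only works as-is when the first query returns a manageable number of IDs.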