Get the phrase found by elastic search in a complex bool query

41 views Asked by At

I have a pretty complex query that is running in the search API (elastic python client search API) for very large amount of phrases with some other constriens

            query = {
            "query": {
                "bool": {
                    "should": [
                        {
                            "bool": {
                                "must": [
                                    {"terms": {"page_id": chunked_pages[entity]}},
                                    {
                                        "bool": {
                                            "should": [
                                                {"match_phrase": {"content": {"query": name, "slop": 6}}}
                                                for name in chunked_names[entity]
                                            ]
                                        }
                                    },
                                ]
                            }
                        }
                        for entity in chunked_names.keys()
                    ],
                    "minimum_should_match": 1,
                }
            },
            "highlight": {
                "fields": {
                    "content": {}
                },
                "pre_tags": ["<em>"],
                "post_tags": ["</em>"],
            },
            "from": from_param,
            "size": results_per_request
        } 
response = es.search(index=index_name, body=query)

And for each retrieved document I would like to know what phrase has been found there (since there are thousands of potential phrases ). I tried using the highlight but I am getting outputs that suggest that the highlight feature is mixing between bool clauses, while the document is correct, the highlight terms are not related (breaking the page_ids constraints)

Any idea how to deal with it?

0

There are 0 answers