I have a few million documents in my index.
I have a sentence and want to retrieve the document that matches as many of its words as possible. I only need to search one field, content.
curl -X GET "xxx.com:9200/test/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "term": { "content": { "value": "popular artworks of Banksy" } } }
      ]
    }
  }
}
'
I want the document that contains as many words from the query as possible; the more, the better. If a document has many occurrences of artwork and Banksy and a few of popular, it should be scored high. Additionally, is it possible to give less weight to a match on a word that occurs more commonly than the others? For example, more weight to Banksy than to popular. I understand that I could use boost, but I don't want to set these values manually. I want the scoring to handle this implicitly if possible.
Adding a working example with index data, search query, and search result.
Refer to the Elasticsearch documentation on the match_phrase and bool queries for a detailed explanation.
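A minimal sketch of how such a query could look, assuming the host, index, and content field from the question and a content field mapped as analyzed text. The term query in the question is not analyzed, so it looks for the whole sentence as one exact token and is unlikely to match anything. A match query analyzes the sentence into separate words and ORs them by default, so a document that contains more of the words scores higher, and the default BM25 similarity already gives rarer words (such as Banksy) more weight than common ones (such as popular or of) without any manual boost. The optional match_phrase clause and its slop value are illustrative choices, not a requirement; the variable and function names are hypothetical.

```shell
# Host and index are taken from the question; adjust for your cluster.
ES_URL="xxx.com:9200/test/_search?pretty"

# "match" splits the sentence into terms and scores each matching term.
# "match_phrase" (optional) adds extra score when the words appear close together.
# With only "should" clauses, a document must match at least one of them.
QUERY='{
  "query": {
    "bool": {
      "should": [
        { "match": { "content": "popular artworks of Banksy" } },
        { "match_phrase": { "content": { "query": "popular artworks of Banksy", "slop": 2 } } }
      ]
    }
  }
}'

# Hypothetical helper: sends the query body defined above.
run_search() {
  curl -s -X GET "$ES_URL" -H 'Content-Type: application/json' -d "$QUERY"
}
```

Calling run_search sends the request. Dropping the match_phrase clause leaves a plain match query, which is enough on its own if word proximity does not matter.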
Index Data:
Search Query:
Search Result: