I have a search with multiple criteria.
Each criterion (grouped under `should`) has a different weighted score.
Elasticsearch returns a list of results, each with a score that seems arbitrary to me, because I can't find a denominator for it.
My question is: how can I represent each score as a ratio?
Dividing each score by `max_score` would not work, since it would show the best match as a 100% match with the search criteria.
The `_score` calculation depends on the combination of queries used. For instance, a simple query for the term `search` in the `title` field would use Lucene's TFIDFSimilarity, combining:
- term frequency (TF): how many times does the term `search` appear in the `title` field of this document? The more often, the higher the score.
- inverse document frequency (IDF): how many times does the term `search` appear in the `title` field of all documents in the index? The more often, the lower the score.
- field norm: how long is the `title` field? The longer the field, the lower the score. (Shorter fields like `title` are considered to be more important than longer fields like `body`.)
- a query normalization factor (this can be ignored).
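To make the interplay of these factors concrete, here is a minimal Python sketch. It assumes the classic Lucene formulas (square root of the term frequency, a logarithmic IDF, and one over the square root of the field length) and multiplies them into a simplified score. The function names are hypothetical, and the real implementation applies further factors (boosts, the query normalization factor, lossy norm encoding), so treat the numbers as illustrative only.

```python
import math


def tf(freq):
    """Term frequency factor: grows with the number of occurrences in the field."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency: shrinks as more documents contain the term."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: shorter fields get a higher weight."""
    return 1.0 / math.sqrt(num_terms)


def simple_score(freq, doc_freq, num_docs, num_terms):
    """Simplified single-term score (assumption: boosts and query norm ignored)."""
    return tf(freq) * idf(doc_freq, num_docs) * field_norm(num_terms)


if __name__ == "__main__":
    # Same term and index; only the field length differs.
    print(simple_score(freq=1, doc_freq=9, num_docs=1000, num_terms=4))
    print(simple_score(freq=1, doc_freq=9, num_docs=1000, num_terms=100))
```

Note that nothing in this product has a natural upper bound, which is why the raw `_score` has no obvious denominator.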
On the other hand, a `bool` query with several clauses would calculate the `_score` for each clause which matches, add them together, then divide by the total number of clauses (and once again have the query normalization factor applied).
So it depends entirely on what queries you are using.
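As a rough illustration of that description, the Python sketch below builds a `bool` query body with several `should` clauses and combines per-clause scores the way the paragraph above describes. The field name, the terms, and both helper names are hypothetical, and the combination step follows the description given here rather than any particular Lucene version.

```python
import json


def bool_should_query(field, terms):
    """Build a bool query with one match clause per term (hypothetical helper)."""
    return {"bool": {"should": [{"match": {field: term}} for term in terms]}}


def combine_clause_scores(matching_scores, total_clauses):
    """Sum the scores of the matching clauses, then divide by the clause count."""
    return sum(matching_scores) / total_clauses


if __name__ == "__main__":
    query = bool_should_query("title", ["foo", "bar", "baz", "qux"])
    print(json.dumps({"query": query}, indent=2))
    # Two of the four clauses matched, with per-clause scores 1.5 and 0.5.
    print(combine_clause_scores([1.5, 0.5], total_clauses=4))
```

A document matching only some clauses is therefore penalised twice: it collects fewer clause scores, and the sum is still divided by the full clause count.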
You can get a detailed explanation of how the `_score` was calculated by adding the `explain` parameter to your query.
Without understanding what you want your query to do, it is impossible to answer this. Depending on your use case, you could use the `function_score` query to implement your own scoring algorithm.
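One possible way to get a score with a known denominator, sketched here in Python under assumptions, is to express each criterion as a filter with a weight inside `function_score`, sum the weights of the filters that match, and divide by the sum of all weights. The criteria, field names, and helper names below are hypothetical. The body also sets `explain` so that each hit carries an `_explanation` tree, which the last helper flattens for reading. How documents that match none of the filters should be handled is left out of this sketch.

```python
# Hypothetical criteria: (filter clause, weight). Replace with your own.
CRITERIA = [
    ({"term": {"color": "red"}}, 5),
    ({"term": {"size": "large"}}, 3),
    ({"range": {"price": {"lte": 100}}}, 2),
]


def build_search_body(criteria):
    """Search body whose score is the sum of the weights of matching filters."""
    functions = [{"filter": flt, "weight": weight} for flt, weight in criteria]
    return {
        "explain": True,  # ask for a per-hit breakdown of the score
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                "score_mode": "sum",      # add up the weights of matching functions
                "boost_mode": "replace",  # ignore the inner query's own score
            }
        },
    }


def max_possible_score(criteria):
    """The denominator: the score of a document matching every criterion."""
    return sum(weight for _, weight in criteria)


def score_ratio(score, criteria):
    """Express a hit's score as a fraction of the maximum possible score."""
    return score / max_possible_score(criteria)


def flatten_explanation(expl, depth=0):
    """Turn a nested explanation (value/description/details) into indented lines."""
    lines = ["  " * depth + f"{expl['value']} {expl['description']}"]
    for child in expl.get("details", []):
        lines.extend(flatten_explanation(child, depth + 1))
    return lines
```

With these weights, a hit scoring 8 (the first two criteria matched) would be reported as `score_ratio(8, CRITERIA)`, that is 0.8, regardless of what the best hit in the result set scored.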