How can I get the pure k-NN score when using a scoring script in OpenSearch?


I am using a scoring script to do prefiltering with exact k-NN. Here is what a sample query looks like:

GET /my_index/_search
{
  "query": {
    "script_score": {
      "query": {
        "bool": {
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "price": {
                      "gte": 200,
                      "lte": 350
                    }
                  }
                }
              ]
            }
          }
        }
      },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "my_vector",
          "query_value": [1.5, 5.5, 4.5, 6.4],
          "space_type": "cosinesimil"
        }
      }
    }
  }
}

Here is a sample of the response:

"max_score": 1.017859,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1234",
        "_score": 1.017859,
        "_source": {
          "my_vector": [
            1.7,
            4.9,
            4.8,
            5.3
          ],
          "price": 250,
          "category": 376,
          "subcategory": 3265
        },
        ...
      }
      ...

How is the score computed here, and why is it greater than 1? After the prefiltering (scoring script) part, is there a way to get the similarity score for just the k-NN search? My use case is to rank the documents by the k-NN score alone once the prefiltering is done. How do I achieve that?

There is 1 answer

Answer by Adarsh Ghagta

OpenSearch adds 1 to the cosine similarity score for every document, so the actual cosine similarity can be calculated by subtracting 1 from the score returned by the query. Because the same constant is added to every document, the offset does not change the ordering of the hits; they are already ranked by the k-NN similarity. This is from the documentation: "Cosine similarity returns a number between -1 and 1, and because OpenSearch relevance scores can’t be below 0, the k-NN plugin adds 1 to get the final score."

https://opensearch.org/docs/latest/search-plugins/knn/knn-score-script/
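
As a rough illustration of the "+1" rule quoted above, here is a minimal Python sketch. It assumes the `cosinesimil` space type in the score script and nothing else; the function names (`cosine_similarity`, `knn_script_score`, `cosine_from_knn_score`) and the example hits are hypothetical and are not part of OpenSearch or any client library.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_script_score(a, b):
    """Score as described in the quoted docs: 1 + cosine similarity."""
    return 1.0 + cosine_similarity(a, b)


def cosine_from_knn_score(score):
    """Undo the +1 offset to recover the raw cosine similarity."""
    return score - 1.0


# Hypothetical hits, shaped like the "hits" array of a search response.
hits = [
    {"_id": "a", "_score": 1.017859},
    {"_id": "b", "_score": 1.5},
    {"_id": "c", "_score": 0.9},
]

# Ranking by _score and ranking by the recovered cosine similarity agree,
# because subtracting a constant does not change the order.
by_score = sorted(hits, key=lambda h: h["_score"], reverse=True)
by_cosine = sorted(
    hits, key=lambda h: cosine_from_knn_score(h["_score"]), reverse=True
)
```

In practice this means no extra query is needed: read `_score` from each hit and subtract 1 on the client side if the raw cosine similarity is wanted. Other space types use different score formulas, so this conversion should not be assumed to carry over to them.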