Approximate nearest neighbors search returns too few results

I have 1M records in my DB with the following schema:

schema embeddings {
  document embeddings {
    field id type int {}
    field text_embedding type tensor<double>(d0[960]) {
      indexing: attribute | index
      attribute {
        distance-metric: euclidean
      }
      index {
        hnsw {
          max-links-per-node: 16
          neighbors-to-explore-at-insert: 100
        }
      }
    }
  }

  rank-profile closeness {
    num-threads-per-search:1
    inputs {
      query(query_embedding) tensor<double>(d0[960])
    }
    first-phase {
      expression: closeness(field, text_embedding)
    }
  }
}

My query for finding the nearest neighbors looks like this:

body = {
    'yql': 'select * from embeddings where ({approximate:true, targetHits:100} nearestNeighbor(text_embedding, query_embedding));',
    "hits":100,
    'input': {
        'query(query_embedding)': [...],
    },
    "ranking": {
        "profile": "closeness",
        "softtimeout": {
            "enable": False
        }
    }
}

For some reason, the number of results for certain vectors is smaller than targetHits. Changing timeouts does not help.

Here is the coverage section from the response:

"id": "toplevel",
"relevance": 1.0,
"fields": {
    "totalCount": 39
},
"coverage": {
    "coverage": 100,
    "documents": 1000000,
    "full": true,
    "nodes": 1,
    "results": 1,
    "resultsFull": 1
},

Is there any way to receive exactly (or at least no fewer than) targetHits results? Obviously there are enough candidates, since closeness can be calculated for any other vector in the DB.

Answer by Jo Kristian Bergum:

When you ask for targetHits:100, Vespa should expose that many hits to first-phase ranking, per content node. If it does not, we would be very interested in how to reproduce it; the best way is to create an issue on GitHub at vespa-engine/vespa. There is also support for dropping hits in first-phase ranking using rank-score-drop-limit, which can reduce the result set and totalCount, but that does not seem to be enabled here.

The hits parameter (or limit in YQL) controls how many hits are returned in the response.
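To make the distinction concrete, here is a minimal Python sketch that builds the request body from the question with targetHits and hits set independently. The helper name build_ann_query and its arguments are made up for illustration; the field, tensor, and rank-profile names are taken from the question.

```python
def build_ann_query(query_embedding, target_hits=100, hits=100):
    """Build a request body; target_hits and hits are set independently.

    target_hits goes into the YQL annotation (hits exposed to first-phase
    ranking per content node), while hits only limits the response size.
    """
    yql = (
        "select * from embeddings where "
        f"({{approximate:true, targetHits:{target_hits}}} "
        "nearestNeighbor(text_embedding, query_embedding))"
    )
    return {
        "yql": yql,
        "hits": hits,
        "input": {"query(query_embedding)": list(query_embedding)},
        "ranking": {"profile": "closeness"},
    }
```

For example, build_ann_query(vec, target_hits=200, hits=10) asks the content node to expose 200 candidates to ranking while returning only the 10 best. Whether a larger targetHits changes the totalCount you observe is something to test, not a confirmed fix.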

Vespa's default timeout is 500ms, and if your system is heavily overloaded (or using exact search with approximate:false), you might see soft-timeouts where Vespa returns a partial result. This situation is reflected in the returned result coverage element.
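As a rough way to check for this on the client side, the sketch below inspects a parsed JSON response. It assumes the result is wrapped in a top-level "root" object and that a degraded result carries a "degraded" object of boolean flags under "coverage"; treat both as assumptions to verify against your own responses. The function name diagnose_result is hypothetical.

```python
def diagnose_result(response, target_hits):
    """Return notes on why a result may hold fewer hits than expected."""
    root = response.get("root", {})
    coverage = root.get("coverage", {})
    total = root.get("fields", {}).get("totalCount", 0)
    notes = []

    # "full": false means the result did not cover the whole corpus.
    if not coverage.get("full", False):
        notes.append(f"partial coverage: {coverage.get('coverage')}%")

    # Assumed layout: {"degraded": {"timeout": true, ...}} when degraded.
    for reason, active in coverage.get("degraded", {}).items():
        if active:
            notes.append(f"degraded: {reason}")

    if total < target_hits:
        notes.append(f"totalCount {total} is below targetHits {target_hits}")
    return notes
```

Applied to the response in the question (full coverage, totalCount 39), this reports only the low totalCount, which suggests a timeout is not the cause there.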