I have 1M records in my database with the following schema:
schema embeddings {
    document embeddings {
        field id type int {}
        field text_embedding type tensor<double>(d0[960]) {
            indexing: attribute | index
            attribute {
                distance-metric: euclidean
            }
            index {
                hnsw {
                    max-links-per-node: 16
                    neighbors-to-explore-at-insert: 100
                }
            }
        }
    }
    rank-profile closeness {
        num-threads-per-search: 1
        inputs {
            query(query_embedding) tensor<double>(d0[960])
        }
        first-phase {
            expression: closeness(field, text_embedding)
        }
    }
}
My query for finding the nearest neighbors looks like this:
body = {
    'yql': 'select * from embeddings where ({approximate:true, targetHits:100} nearestNeighbor(text_embedding, query_embedding));',
    "hits":100,
    'input': {
        'query(query_embedding)': [...],
    },
    "ranking": {
        "profile": "closeness",
        "softtimeout": {
            "enable": false
        }
    }
}
For some reason, for certain vectors the number of results is smaller than targetHits. Changing timeouts does not help.
Here is the coverage section from the response:
"id": "toplevel",
"relevance": 1.0,
"fields": {
    "totalCount": 39
},
"coverage": {
    "coverage": 100,
    "documents": 1000000,
    "full": true,
    "nodes": 1,
    "results": 1,
    "resultsFull": 1
},
Is there any way to receive exactly targetHits results, or at least no fewer? Clearly there are enough candidates, since closeness can be calculated for any other vector in the database.
When you ask for targetHits:100, Vespa will expose at least that many hits to the first-phase ranking phase, per content node. If it does not, then we would be very interested in how to reproduce it. That is best done by creating an issue on GitHub at vespa-engine/vespa.
There is also support for dropping hits in first-phase ranking using rank-score-drop-limit, which can reduce the result set and totalCount. This does not seem to be enabled here.
The hits parameter (or limit in YQL) controls how many hits are returned in the response.
Vespa's default timeout is 500ms, and if your system is heavily overloaded (or using exact search with approximate:false), you might see soft timeouts where Vespa returns a partial result. This situation is reflected in the coverage element of the returned result.
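To separate the cases above when filing a report, it can help to inspect the parsed response programmatically. The following is a minimal sketch, not an official Vespa client: `diagnose_result` is a hypothetical helper name, the response is assumed to be the default JSON result with a top-level "root" object, and the optional "degraded" object inside "coverage" is assumed to hold boolean flags such as "timeout" when the result is partial.

```python
from typing import Any, Dict, List


def diagnose_result(response: Dict[str, Any], target_hits: int) -> List[str]:
    """Return human-readable findings for a parsed Vespa query response.

    Hypothetical helper: assumes the default JSON result layout with a
    top-level "root" object holding "fields" and "coverage".
    """
    root = response.get("root", {})
    total_count = root.get("fields", {}).get("totalCount", 0)
    coverage = root.get("coverage", {})
    findings: List[str] = []

    # Fewer hits exposed to ranking than requested by targetHits.
    if total_count < target_hits:
        findings.append(
            f"totalCount {total_count} is below targetHits {target_hits}"
        )

    # Partial coverage means not all documents were searched.
    if not coverage.get("full", False):
        findings.append(
            f"coverage is only {coverage.get('coverage', 0)}%"
        )

    # Assumption: "degraded" maps a reason (e.g. "timeout") to a boolean.
    for reason, active in coverage.get("degraded", {}).items():
        if active:
            findings.append(f"degraded: {reason}")

    return findings


# The coverage section quoted in the question, wrapped in "root".
example_response = {
    "root": {
        "id": "toplevel",
        "relevance": 1.0,
        "fields": {"totalCount": 39},
        "coverage": {
            "coverage": 100,
            "documents": 1000000,
            "full": True,
            "nodes": 1,
            "results": 1,
            "resultsFull": 1,
        },
    }
}

print(diagnose_result(example_response, target_hits=100))
```

For the response in the question this reports only the low totalCount, with full coverage and no degradation, which is the situation worth reporting as an issue rather than a timeout.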