I've been reading over the docs for multi-vector queries in Cognitive Search.
I was thinking of adding a feature to our platform to search content 'by example(s)', where a developer using our platform could select one or more existing content items (which have embeddings stored in Cog Search), and then we do a vector search using those embedding vectors.
Basically I would want to have multiple input vectors, but a single vector field in the index, so it would return documents similar to all the input vectors.
(For example, a developer selects three ingested web pages which have embeddings created, and wants to find other content which is similar to all three web pages.)
Would a multi-vector query work for this, or is that only for searching across different vector fields?
Regarding the technical feasibility of having "multiple input vectors searching over a single field in the index": This is perfectly OK; we have no restrictions on the vector fields contained within a vector query besides the total vector field count. You would structure the query component "vectorQueries" (which is an array) to contain multiple individual vector queries, each with the "searchField" set to the same vector field. Each of those queries returns a recall set, and we fuse the recall sets using RRF (Reciprocal Rank Fusion) to form the final unified set. (Note that the scores will be RRF values, which are derived from ranks rather than similarities, so you can't threshold on them.)
Note that the total vector field count in a single search request can't exceed 10. This total covers both cross-field queries (one vector query targeting multiple search fields) and multi-vector queries (more than one vector query), and fields that appear in multiple vector queries are included in the count.
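To make the request shape concrete, here is a minimal sketch that builds the body for a multi-vector query over a single field. The property names `kind`, `vector`, `fields`, and `k` are my understanding of the REST request body (the answer above calls the field selector "searchField"), so check them against the API version you use. The field name `contentVector` is hypothetical, and the limit check assumes each vector query counts once toward the limit of 10 even though they all target the same field.

```python
from typing import Any

MAX_VECTOR_FIELDS = 10  # limit quoted in the answer above


def build_multi_vector_query(
    vectors: list[list[float]], field: str, k: int = 10
) -> dict[str, Any]:
    """Build a search request body with one vector query per input vector."""
    if not vectors:
        raise ValueError("at least one input vector is required")
    # Assumption: every vector query counts once toward the field limit,
    # even when all of them target the same field.
    if len(vectors) > MAX_VECTOR_FIELDS:
        raise ValueError(f"at most {MAX_VECTOR_FIELDS} input vectors are allowed")
    return {
        "vectorQueries": [
            # Assumed property names; verify against your API version.
            {"kind": "vector", "vector": vec, "fields": field, "k": k}
            for vec in vectors
        ]
    }


# Three example embeddings (shortened to two dimensions for readability).
body = build_multi_vector_query(
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], field="contentVector"
)
```

The resulting `body` would be sent as the JSON payload of a search request; each entry in `vectorQueries` produces its own recall set before fusion.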
Regarding whether this will accomplish your goal: If your goal is to find content similar to all query vectors, then this scenario would work, with some caveats. Note that RRF will tend to rank a document that's moderately ranked but appears in multiple recall sets higher than a document that is highly ranked in fewer recall sets. This behavior may require some experimentation on your part to decide whether it's acceptable. In this particular situation, since the scores in all recall sets are directly comparable (same similarity metric), you may want to use an aggregation function such as sum, avg, or max instead of the RRF scores. That gives you a value derived from the cosine similarity scores, which you can threshold on. You'll need to do this yourself at the moment by issuing each vector query separately and fusing the recall sets in your app.
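The ranking behavior described above, and the app-side alternative, can be sketched roughly as follows. This is an illustration and not the service's implementation: the RRF constant of 60 is an assumption, the document ids and scores are made up, and treating a document that is missing from a recall set as having similarity 0.0 is a design choice you may want to change.

```python
from collections import defaultdict

RRF_K = 60  # assumed smoothing constant; the service's value may differ


def rrf_fuse(rankings: list[list[str]], k: int = RRF_K) -> dict[str, float]:
    """Reciprocal Rank Fusion: each ranking adds 1 / (k + rank) per document."""
    fused: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return dict(fused)


def fuse_by_average(
    recall_sets: list[dict[str, float]], missing: float = 0.0
) -> dict[str, float]:
    """Average each document's similarity score over all recall sets."""
    doc_ids = set().union(*recall_sets)
    n = len(recall_sets)
    # A document absent from a recall set contributes `missing` (assumption).
    return {
        doc_id: sum(rs.get(doc_id, missing) for rs in recall_sets) / n
        for doc_id in doc_ids
    }


# "A" is ranked first in one recall set only; "B" is fifth in both.
rankings = [["A", "x1", "x2", "x3", "B"], ["y1", "y2", "y3", "y4", "B"]]
rrf_scores = rrf_fuse(rankings)
# RRF puts "B" above "A" even though "A" was the top hit for one query.

# App-side fusion on the raw similarity scores (made-up values).
recall_sets = [{"A": 0.95, "B": 0.70}, {"B": 0.70}]
avg_scores = fuse_by_average(recall_sets)
# Unlike RRF values, these averages can be compared against a threshold.
similar_to_all = {d for d, s in avg_scores.items() if s >= 0.6}
```

Swapping the average for a sum or max only changes the aggregation line; which one fits depends on how strongly you want to penalize documents that are close to some input vectors but not others.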
On a related note, say you want to find content that matches any of the query vectors. This scenario may not work well because, even though RRF fusion would return the union of all recall sets, it would have the same ranking concern as above. In this scenario you might wish to try issuing each vector query separately from your app and fusing the recall sets yourself, with each duplicate doc id taking the max similarity score among the recall sets that contain it.
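A minimal sketch of that "match any" fusion, assuming each recall set has already been fetched by a separate single-vector query and reduced to a mapping from doc id to similarity score (the ids and scores below are made up):

```python
def fuse_by_max(recall_sets: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Union the recall sets, keeping each document's best similarity score."""
    best: dict[str, float] = {}
    for recall_set in recall_sets:
        for doc_id, score in recall_set.items():
            # A duplicate doc id keeps the highest score seen so far.
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Highest similarity first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranked = fuse_by_max([{"A": 0.95, "B": 0.70}, {"B": 0.80, "C": 0.60}])
# ranked == [("A", 0.95), ("B", 0.80), ("C", 0.60)]
```

With max, a document that is a strong match for a single input vector is not pushed down by documents that are mediocre matches for several of them.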
One key question you'll need to ask yourself is how documents that appear in multiple recall sets should be ranked relative to docs that appear in fewer, and whether the magnitude of the similarity scores should play a role.