Using
connections.connect("default", host=cfg.db.connection.host, port=cfg.db.connection.port)
collection = Collection(name="ibss")
I can connect to the Milvus database and select the collection.
Using
query_vector = SentenceTransformerEmbeddings().embed_query(SOME_TEXT_HERE)
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 10},
}
results = collection.search(data=[query_vector], param=search_params, anns_field="vector", limit=3, output_fields=[])
# the question is how to get the text out of ID
# that can be used to remove
for hits in results:
    for hit in hits:
        print(hit)
        break
I get output like
id: 445499747977925765, distance: 0.24354277551174164, entity: {}
It turned out that the entity is always empty regardless of the column, so I decided to leave output_fields as [] to include all fields.
Now, given the id "445499747977925765", I would like to retrieve the document, so I tried
entity_id = 445499747977925765
# Define the filter condition
filter_expr = f"id == {entity_id}"
# Search with the filter condition
a = collection.search(
    data=[],
    anns_field="",
    param={"filter": filter_expr},
    limit=1
)
but a is empty!
To give you the full picture, I am populating the database using
from langchain.vectorstores import Milvus
vector_db = Milvus.from_documents(
    deduplicated_documents,
    SentenceTransformerEmbeddings(),
    connection_args={"host": cfg.db.connection.host, "port": cfg.db.connection.port},
    collection_name=cfg_data.target.collection.name,
    drop_old=cfg_data.target.collection.renew
)
I would appreciate hearing how to retrieve the document given its ID.
According to the Milvus documentation, the filter condition should be passed through the expr argument of search, not as a key inside param. But since you're not trying to conduct a vector search, it's better to use the query method instead of search.
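A minimal sketch of both options, not a definitive implementation. The helper names (build_id_filter, fetch_by_id, search_with_filter) are hypothetical, and the field names "pk", "text" and "vector" are assumptions based on what LangChain's Milvus wrapper typically creates; check collection.schema for the real names in your collection. Each helper takes the already-connected collection as an argument.

```python
def build_id_filter(entity_id, pk_field="pk"):
    """Build a Milvus boolean expression that matches one primary key.

    pk_field is an assumption; inspect collection.schema for the real name.
    """
    return f"{pk_field} == {int(entity_id)}"


def fetch_by_id(collection, entity_id, pk_field="pk", output_fields=("text",)):
    """Scalar lookup by primary key.

    No query vector is involved, so this uses query() rather than search().
    The fields to return must be named explicitly in output_fields.
    """
    return collection.query(
        expr=build_id_filter(entity_id, pk_field),
        output_fields=list(output_fields),
    )


def search_with_filter(collection, query_vector, filter_expr,
                       limit=3, output_fields=("text",)):
    """Vector search restricted by a filter.

    The filter goes in the expr argument; param only carries the
    metric and index search settings.
    """
    return collection.search(
        data=[query_vector],
        anns_field="vector",  # assumed vector field name
        param={"metric_type": "L2", "params": {"nprobe": 10}},
        limit=limit,
        expr=filter_expr,
        output_fields=list(output_fields),
    )
```

With the collection from the question, fetch_by_id(collection, 445499747977925765) should return a list of dicts, one per matching entity, each holding the requested fields. Passing a non-empty output_fields to search in the same way is also what populates hit.entity.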