I'm thinking about using Milvus vector storage in my Flask-based project and have been looking at the PyMilvus (Python SDK) documentation. I haven't found any information yet about:
- Is PyMilvus thread-safe?
- Is PyMilvus fork-safe?
- How does connection pooling work in the SDK?
Could you help me sort this out?
The official documentation doesn't say much about these topics.
As of v2.3.x, PyMilvus doesn't provide a thread pool or a connection pool. Instead, it has a global object, "connections", that maintains client-to-server connections.
To create a connection, call connections.connect().
This method has an "alias" parameter, which is the name of the connection. Internally, the "connections" object maintains a map from name to connection. If you don't provide an "alias", the connection is named "default".
When you declare a Collection, the "using" parameter specifies, by name, which connection it uses. If you don't provide "using", the "default" connection is used. All of that Collection's methods then go through this connection.
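As a rough illustration of how "alias" and "using" fit together, here is a minimal sketch. The helper name `open_collection`, the host and port defaults, and the collection name in the usage note are all assumptions made for the example; only `connections.connect(alias=..., host=..., port=...)` and `Collection(name, using=...)` come from the SDK. It also assumes the collection already exists on the server, since no schema is passed.

```python
def open_collection(name, alias="default", host="localhost", port="19530"):
    """Open a named connection, then bind a Collection to it via `using`.

    `open_collection` is a hypothetical helper, not part of PyMilvus.
    """
    # Imported lazily so this module can be loaded without a Milvus server.
    from pymilvus import Collection, connections

    # Registers the connection under `alias` in the global `connections` map.
    connections.connect(alias=alias, host=host, port=port)

    # Every call made through this Collection goes over the `alias` connection.
    # Assumes the collection already exists, because no schema is supplied.
    return Collection(name, using=alias)
```

For example, `open_collection("documents", alias="search")` would bind a hypothetical "documents" collection to a connection named "search" instead of "default".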
The connection object is thread-safe, so you can call a collection's methods from different threads. However, a connection object cannot be shared across multiple processes. If you fork a subprocess, make sure each subprocess creates its own connection by calling connections.connect().
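For a Flask app running under a pre-forking server, one possible way to follow this advice is to connect lazily inside each worker, under an alias that includes the process ID, so a forked child never reuses an entry inherited from its parent. This is only a sketch: `process_alias` and `ensure_connection` are hypothetical helpers, the host and port are placeholders, and the `conns` parameter exists only so the logic can be exercised without a server. The SDK calls used are `connections.has_connection(alias)` and `connections.connect(...)`.

```python
import os
import threading

_connect_lock = threading.Lock()  # guards connection setup across threads


def process_alias(prefix="milvus"):
    """Return a connection alias that is unique to the current process."""
    return f"{prefix}-{os.getpid()}"


def ensure_connection(host="localhost", port="19530", conns=None):
    """Connect at most once per process; return the alias to pass as `using`."""
    if conns is None:
        # Imported lazily so this module can be loaded without PyMilvus.
        from pymilvus import connections as conns

    alias = process_alias()
    with _connect_lock:
        # After a fork the PID changes, so the child gets a new alias and
        # opens its own connection instead of reusing the parent's.
        if not conns.has_connection(alias):
            conns.connect(alias=alias, host=host, port=port)
    return alias
```

Inside a request handler you could then write something like `Collection("documents", using=ensure_connection())`, where "documents" is a placeholder name. Threads in the same worker share one connection, while each forked worker opens its own.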