langchain: how to use a custom deployed fastAPI embedding model locally?

I want to build a retriever in LangChain and use an embedding model that is already deployed behind a FastAPI service. How can I do that?

from langchain_community.vectorstores import DocArrayInMemorySearch

embeddings_model = requests.post("http://internal-server/embeddings/")

db = DocArrayInMemorySearch.from_documents(chunked_docs, embeddings_model)
retriever = db.as_retriever()

1 Answer

Andrew Nguonly

You can create a custom embeddings class that subclasses the BaseModel and Embeddings classes. Example:

from typing import List

import requests
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import BaseModel


class APIEmbeddings(BaseModel, Embeddings):
    """Calls an API to generate embeddings."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # TODO: Call API to generate embeddings
        # requests.post("http://internal-server/embeddings/")
        pass

    def embed_query(self, text: str) -> List[float]:
        # TODO: Call API to generate embeddings
        # requests.post("http://internal-server/embeddings/")
        pass

embed_documents() and embed_query() are abstract methods on the Embeddings class, so both must be implemented. The OllamaEmbeddings class is a simple reference for how to write a custom embeddings class.
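
For completeness, here is a minimal sketch of what a concrete implementation might look like. It assumes the FastAPI endpoint accepts a JSON body of the form {"texts": [...]} and responds with {"embeddings": [[...], ...]}; adjust the payload and response keys to match your actual API.

from typing import List

import requests
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import BaseModel


class APIEmbeddings(BaseModel, Embeddings):
    """Calls a FastAPI endpoint to generate embeddings."""

    # Hypothetical endpoint URL; replace with your deployment's URL
    endpoint: str = "http://internal-server/embeddings/"

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Assumed request/response schema: {"texts": [...]} in,
        # {"embeddings": [[...], ...]} out
        response = requests.post(self.endpoint, json={"texts": texts})
        response.raise_for_status()
        return response.json()["embeddings"]

    def embed_query(self, text: str) -> List[float]:
        # A query is just a single text; reuse embed_documents
        return self.embed_documents([text])[0]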

You can use the custom embeddings class just like any other embeddings class.

embeddings_model = APIEmbeddings()

db = DocArrayInMemorySearch.from_documents(chunked_docs, embeddings_model)
retriever = db.as_retriever()
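
Once the vector store is built, the retriever can be queried like any other LangChain retriever; a quick usage sketch (the query string below is only a placeholder):

# Retrieve the documents most similar to the query
docs = retriever.invoke("example query about my documents")
for doc in docs:
    print(doc.page_content)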

References

  1. Embeddings (LangChain GitHub)
  2. Ollama (LangChain GitHub)