I have a list of texts stored on disk that I load with pickle. The texts amount to around 400K paragraphs, and when I load them all into RAM to build a FAISS DB, the memory explodes. Is there a way to do this incrementally? Below is my current minimal working code:
import os
import pickle

from dotenv import load_dotenv
from tqdm import tqdm
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_KEY")


def get_all_docs_from_chunks():
    """Creates one list of all the chunks (RAM explodes here)."""
    all_docs = []
    chunks_files = sorted(
        [f for f in os.listdir("chunks") if f.endswith(".pkl")],
        key=lambda x: int(x.split(".")[0]),
    )
    for chunk_file in tqdm(chunks_files, desc="Loading Chunks"):
        with open(f"chunks/{chunk_file}", "rb") as file:
            all_docs.extend(pickle.load(file))
    return all_docs


if __name__ == "__main__":
    all_split_docs = get_all_docs_from_chunks()  # code gets stuck here and eventually dies around iteration 3000/400K
    print("--got all docs--")
    embeddings = OpenAIEmbeddings()
    filename = "faiss_openai_embeddings"
    print("-- making embeddings --")
    db = FAISS.from_documents(all_split_docs, embeddings)
    db.save_local(filename)
    print("-- embeddings saved --")
Any help would be highly appreciated. I am fine with loading, say, 100 chunks, making their embeddings, and then updating the index with the next batch, but I can't find LangChain FAISS documentation that shows how to do this.
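One possible approach, sketched under the assumption that the chunks/ layout and pickle format from the code above hold: build the index from the first chunk file with FAISS.from_documents, then extend it with add_documents() one file at a time, so only one pickle's worth of documents is in RAM at any point.

import os
import pickle

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
db = None

chunks_files = sorted(
    [f for f in os.listdir("chunks") if f.endswith(".pkl")],
    key=lambda x: int(x.split(".")[0]),
)

for chunk_file in chunks_files:
    with open(f"chunks/{chunk_file}", "rb") as file:
        docs = pickle.load(file)  # only this file's documents are in memory
    if db is None:
        # create the index from the first batch
        db = FAISS.from_documents(docs, embeddings)
    else:
        # embed and append each subsequent batch to the existing index
        db.add_documents(docs)
    del docs  # let the batch be garbage-collected before loading the next file

db.save_local("faiss_openai_embeddings")

Alternatively, LangChain's FAISS wrapper has a merge_from() method, so you could build a small index per chunk file and fold each one into a running index; the add_documents() route above avoids the extra intermediate indexes.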
There's a logger call in the embeddings function; if you comment that out, execution runs in seconds.