Getting RateLimitError when using open-source LLMs


I was going through some online examples of using large language models for information retrieval and wanted to explore open-source solutions for that purpose. Several tutorials suggest that HuggingFaceInstructEmbeddings work well for document representation and that Meta AI's LLaMA-2 is good enough for text generation. However, when I run a very simple example, I get the error openai.error.RateLimitError, which, as far as I understand, should have nothing to do with non-OpenAI solutions. Any ideas what may be causing it? I will soon get a paid OpenAI API subscription anyway, but it would be very useful to understand why such an error arises where one doesn't expect it.

P.S. I had to add lines 2 and 3 of the snippet (importing os and setting OPENAI_API_KEY) because, for the same unknown reason, the code expects me to provide a key.

Brief description of my code snippet:

  • I provide a PDF and a question I want answered using the contents of that document
  • The document is read, split, converted to vector representations with the provided embedding function, and stored in the vector store
  • The question is converted into a vector representation, and the most similar documents are retrieved from the vector store
  • The retrieved documents are used to generate an answer to the question.

import sys
import os
os.environ['OPENAI_API_KEY'] = 'my_open_api_key'

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.llms import Ollama
from langchain.agents.agent_toolkits import create_vectorstore_agent
from langchain.agents.agent_toolkits import VectorStoreInfo, VectorStoreToolkit
from langchain.embeddings import HuggingFaceInstructEmbeddings


rcts = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
embedding = HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-xl')


def init_vectorstore(pdf_fn):
    loader = PyPDFLoader(pdf_fn)
    data = loader.load()
    splits = rcts.split_documents(data)
    return Chroma.from_documents(documents=splits, embedding=embedding)


def main(llm, pdf_fn, question):
    vs = init_vectorstore(pdf_fn)
    vi = VectorStoreInfo(name='vsi', description='simple example', vectorstore=vs)
    toolkit = VectorStoreToolkit(vectorstore_info=vi)
    agent_executor = create_vectorstore_agent(llm=llm, toolkit=toolkit, verbose=False)
    response = agent_executor.run(question)
    print(response)


if __name__ == '__main__':
    llm = Ollama(model='llama2', temperature=1e-10)
    pdf_fn, question = sys.argv[1:3]
    main(llm, pdf_fn, question)

I won't paste the complete error trace as it's too big, but it ends like this:

Traceback (most recent call last):
  File "/Users/rsuleimanov/Documents/llm_deeds/cookbook/error_example.py", line 38, in <module>
    main(llm, pdf_fn, question)
  File "/Users/rsuleimanov/Documents/llm_deeds/cookbook/error_example.py", line 29, in main
    agent_executor.run(question)
...

  File "/Users/rsuleimanov/Documents/llm_deeds/langchainenv/lib/python3.9/site-packages/openai/api_requestor.py", line 687, in _interpret_response_line
    raise self.handle_error_response(
openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.
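
My best guess so far (unverified) is that some component in the chain falls back to an OpenAI-backed client when no llm is passed to it explicitly; in my snippet, VectorStoreToolkit is the one place where I never pass my Ollama model. A minimal stdlib-only sketch of that "default factory" pitfall, with hypothetical class names standing in for the real langchain/openai ones:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for illustration only; these are NOT the real
# langchain or openai classes.
class QuotaError(Exception):
    pass

class RemoteLLM:
    """Stand-in for a hosted client that enforces quotas."""
    def run(self, prompt: str) -> str:
        raise QuotaError("You exceeded your current quota")

class LocalLLM:
    """Stand-in for a locally served model."""
    def run(self, prompt: str) -> str:
        return "local answer"

@dataclass
class Toolkit:
    # The pitfall: a default factory silently supplies the remote client
    # whenever the caller does not pass `llm` explicitly.
    llm: object = field(default_factory=RemoteLLM)

# Passing a local model elsewhere does not help if the toolkit itself
# was built with its default:
implicit = Toolkit()               # silently uses RemoteLLM
explicit = Toolkit(llm=LocalLLM()) # uses the local model
print(type(implicit.llm).__name__, type(explicit.llm).__name__)
```

If that is indeed what is happening, passing the local model explicitly to every component that accepts an llm argument should keep the snippet from ever touching the OpenAI API.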