I was going through some online examples of using large language models for information retrieval and wanted to explore open-source solutions for that purpose. Some tutorials suggested that HuggingFaceInstructEmbeddings
works well for document representation and that LLaMA-2
from Meta AI is good enough for text generation. However, when I try to run a very simple example I get the error openai.error.RateLimitError,
which as far as I understand has nothing to do with non-OpenAI solutions. Any ideas what may be causing this error? I will soon get a paid subscription for the OpenAI API anyway, but it would be extremely useful for me to understand why such an error arises where one doesn't expect it.
P.S. I had to add lines 2 and 3 of the code because, for the same unknown reason, it expects me to provide the key.
Brief description of my code snippet:
- I provide a PDF and a question I want answered using the contents of that document
- The document is read, split, converted to vector representations using the provided embedding function, and stored in the vector store
- The question is converted into a vector representation and the most similar documents from the vector store are found
- The retrieved documents are used to generate an answer to the question.
import sys
import os

os.environ['OPENAI_API_KEY'] = 'my_open_api_key'

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.llms import Ollama
from langchain.agents.agent_toolkits import create_vectorstore_agent
from langchain.agents.agent_toolkits import VectorStoreInfo, VectorStoreToolkit
from langchain.embeddings import HuggingFaceInstructEmbeddings

rcts = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
embedding = HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-xl')

def init_vectorstore(pdf_fn):
    # Read the PDF, split it into chunks, embed them and store them in Chroma
    loader = PyPDFLoader(pdf_fn)
    data = loader.load()
    splits = rcts.split_documents(data)
    return Chroma.from_documents(documents=splits, embedding=embedding)

def main(llm, pdf_fn, question):
    vs = init_vectorstore(pdf_fn)
    vi = VectorStoreInfo(name='vsi', description='simple example', vectorstore=vs)
    toolkit = VectorStoreToolkit(vectorstore_info=vi)
    agent_executor = create_vectorstore_agent(llm=llm, toolkit=toolkit, verbose=False)
    response = agent_executor.run(question)
    print(response)

if __name__ == '__main__':
    llm = Ollama(model='llama2', temperature=1e-10)
    pdf_fn, question = sys.argv[1:3]
    main(llm, pdf_fn, question)
I won't paste the complete error trace as it's too long, but it ends like this:
Traceback (most recent call last):
File "/Users/rsuleimanov/Documents/llm_deeds/cookbook/error_example.py", line 38, in <module>
main(llm, pdf_fn, question)
File "/Users/rsuleimanov/Documents/llm_deeds/cookbook/error_example.py", line 29, in main
agent_executor.run(question)
...
File "/Users/rsuleimanov/Documents/llm_deeds/langchainenv/lib/python3.9/site-packages/openai/api_requestor.py", line 687, in _interpret_response_line
raise self.handle_error_response(
openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.