Exceeding the LLM's maximum context length even when using llama_index's PromptHelper


I've been struggling with llama_index's PromptHelper and can't find a solution anywhere on the Internet.

But first, let me describe my use case:

I'm using Azure OpenAI's GPT-3.5 model to summarize the comments users have posted on an Instagram post: I pass all the comments to the model as a system message and then ask a question like "What's the general sentiment in the comments?".

The problem is that there are so many comments that, for many posts, I exceed gpt-35-turbo-16k's maximum context length of 16384 tokens. To work around this, I've been using llama_index's PromptHelper, which, if I'm not mistaken, helps split the prompt into chunks in this kind of situation. However, I keep getting the same error no matter how I change PromptHelper's parameters:

InvalidRequestError: This model's maximum context length is 16384 tokens. However, your messages resulted in 22272 tokens. Please reduce the length of the messages.
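
Just to sanity-check the numbers, I counted the tokens in the system message with tiktoken (assuming tiktoken is installed; cl100k_base is, as far as I know, the encoding the gpt-3.5-turbo family uses, and context_message is the system message built in the code below):

import tiktoken

# Count the tokens in the big system message to confirm the overflow
enc = tiktoken.get_encoding("cl100k_base")
num_tokens = len(enc.encode(context_message))  # context_message is defined in the code below
print(num_tokens)  # roughly matches the 22272 tokens reported in the error

So the raw comments really are over the limit on their own, which is why I turned to PromptHelper in the first place.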

I'm pretty sure I'm messing something up in my code but can't find where, and llama_index's documentation isn't helping me much.

Thanks in advance for any help.

Here is my code, in case anyone can spot what I'm doing wrong:

from llama_index.llms import AzureOpenAI
from llama_index.llms.base import ChatMessage
from llama_index.chat_engine import SimpleChatEngine
from llama_index import ServiceContext, PromptHelper
from llama_index.text_splitter import TokenTextSplitter
from llama_index.node_parser import SimpleNodeParser

import pandas as pd
import glob
import os

# Load the user comments from one of the exported CSV files
csv_files = glob.glob('data/csv/*.csv')
df = pd.read_csv(
    csv_files[1],
    sep='|'
)
context_comments = df['COMENTARIO'].to_list()

# System message that packs every comment into the prompt
context_message = f"""These are the comments that users have left on a supermarket company's \
post across different social networks: \n{context_comments}. \n\
You will be asked questions about the comments on the post, which you should answer precisely, \
as if you were writing a report for the people responsible for the post, \
who want to know how it was received."""
prefix_messages = [ChatMessage(content=context_message, role="system")]

# Define the LLM (MODEL_NAME, DEPLOYMENT_NAME, AZURE_OPENAI_API_KEY and
# AZURE_BASE_URL are placeholders defined elsewhere)
llm = AzureOpenAI(
    model=MODEL_NAME,
    engine=DEPLOYMENT_NAME,
    api_key=AZURE_OPENAI_API_KEY,
    api_base=AZURE_BASE_URL,
    api_type="azure",
    api_version="2023-05-15"
)

# Split the documents into 512-token chunks with a 20-token overlap
node_parser = SimpleNodeParser.from_defaults(
    text_splitter=TokenTextSplitter(chunk_size=512, chunk_overlap=20)
)

# Define the prompt helper: 16384-token context window, 1500 tokens reserved for output
prompt_helper = PromptHelper(
    context_window=16384,
    num_output=1500,
    chunk_overlap_ratio=0.2,
    separator="\n"
)

# Create the service context
service_context = ServiceContext.from_defaults(
    llm=llm,
    prompt_helper=prompt_helper,
    node_parser=node_parser,
    embed_model=None,
    chunk_size=512,
    context_window=16384,
    num_output=1500
)

# Use SimpleChatEngine to tie it all together
chat_engine = SimpleChatEngine.from_defaults(
    service_context=service_context,
    verbose=True,
    prefix_messages=prefix_messages
)

response = chat_engine.chat("What is the general sentiment of the comments on the post?")

print(response)

It throws the same error as mentioned above.
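
For what it's worth, my reading of the docs is that PromptHelper's chunking and repacking only kick in when llama_index synthesizes a response over an index, not when everything is stuffed into a SimpleChatEngine's prefix_messages, which seem to get sent to the model verbatim. If that's right, the intended pattern would be something like this (an untested sketch on my side):

from llama_index import Document, ListIndex

# Wrap each comment in a Document so the node parser can chunk them,
# then let the query engine summarize the chunks (tree_summarize merges
# partial answers until everything fits in the context window)
documents = [Document(text=comment) for comment in context_comments]
index = ListIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What is the general sentiment of the comments on the post?")
print(response)

But I'd still like to understand whether SimpleChatEngine with prefix_messages can be made to work here at all.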


There are 0 answers