Handling LangChain Async Calls


I am using Streamlit to build a chat interface with LangChain in the background. I am having trouble properly using the astream_log function from LangChain to generate output. My app is structured as follows:

├─ utils
│  ├─ __init__.py
│  └─ chat.py
├─ app.py

In app.py I define the st.chat_input and call a function from chat.py to generate a response.

Sequential code

This is the barebones of my app.py code when using sequential processing:

import streamlit as st
from utils.chat import generate_chat_response

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state['messages'] = []

# Accept user input
if prompt := st.chat_input("Your message."):
    response = generate_chat_response(
        question=prompt,
        chat_history=st.session_state['messages']
    )
    st.write(response)

The chat.py file looks as follows (shortened to the most important code; my actual code is a custom chain with retrieval and different prompts):

from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI

def create_chain():
    llm = ChatOpenAI()
    characteristics_prompt = ChatPromptTemplate.from_template(
        "Tell me a joke about {subject}."
    )
    llm_chain = LLMChain(
        llm=llm, prompt=characteristics_prompt, output_key="characteristics"
    )
    return llm_chain

def generate_chat_response(question, chat_history):
    chain = create_chain()
    response = chain.invoke({'subject': question})
    return response

Change to asynchronous

Now, I want to use chain.astream_log (docs, github) to access intermediate values in the chain (e.g. retrieved documents). Unfortunately, the chain.*_log methods are only available as asynchronous versions. The question is how to change this code to make it work (I know I can't just swap out chain.invoke for chain.astream_log).

I have already dug into many asynchronous tutorials, but I just can't seem to transfer the knowledge. I expect that I have to change def generate_chat_response ... to async def generate_chat_response ... and call asyncio.run() somewhere.
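To make my guess concrete, here is a minimal self-contained sketch of the pattern I mean. Note that fake_astream_log is a made-up stand-in for chain.astream_log (an async generator yielding pieces of output), not real LangChain code; I just want to confirm whether this sync-wrapper shape is the right idea:

```python
import asyncio

# Made-up stand-in for chain.astream_log: an async generator
# that yields output piece by piece.
async def fake_astream_log(subject):
    for part in ("Why did the", subject, "cross the road?"):
        await asyncio.sleep(0)  # simulate waiting on the LLM
        yield part

async def collect_response(subject):
    # Consume the async generator chunk by chunk with async for.
    chunks = []
    async for chunk in fake_astream_log(subject):
        chunks.append(chunk)
    return " ".join(chunks)

def generate_chat_response(question):
    # Synchronous wrapper: spin up an event loop just for this call,
    # so the rest of the Streamlit script can stay synchronous.
    return asyncio.run(collect_response(question))

print(generate_chat_response("chicken"))
```

Is wrapping the async consumption in asyncio.run() like this the intended way, or does Streamlit's own event loop get in the way?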

One difficulty for me is that there is not much documentation on the async part of LangChain, even though it seems to be an integral part of it. Pairing this with only a superficial understanding of async is making me fail at this.

Three questions:

  1. Is it possible at all to incorporate the asynchronous function without changing the whole script to asynchronous?
  2. What changes do I have to make to my code to get it working?
  3. Is this even the best way to deal with this situation? I took this approach from chat-langchain which uses fastapi to return the chat response.

I am kinda lost at this point and really hope someone can figure this out together with me.

P.S. Please ignore that the code is kind of pointless in the sense of being a good chatbot interface. For me it is most important that I can actually incorporate calling astream_log to get a useful response.
