I have a relatively simple FastAPI app that accepts a query and streams back the response from ChatGPT's API. ChatGPT is streaming back the result and I can see this being printed to console as it comes in.

What's not working is streaming that response back to the client via FastAPI's StreamingResponse: the response arrives all at once instead. I'm really at a loss as to why this isn't working.

Here is the FastAPI app code:

import os
import time

import openai

import fastapi
from fastapi import Depends, HTTPException, status, Request
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from fastapi.responses import StreamingResponse

auth_scheme = HTTPBearer()
app = fastapi.FastAPI()

openai.api_key = os.environ["OPENAI_API_KEY"]

def ask_statesman(query: str):
    #prompt = router(query)
    completion_reason = None
    response = ""
    while not completion_reason or completion_reason == "length":
        openai_stream = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": query}],
            stream=True,
        )
        for line in openai_stream:
            completion_reason = line["choices"][0]["finish_reason"]
            if "content" in line["choices"][0].delta:
                current_response = line["choices"][0].delta.content
                print(current_response)
                yield current_response

@app.post("/")
async def request_handler(auth_key: str, query: str):
    if auth_key != "123":
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid authentication credentials",
            headers={"WWW-Authenticate": auth_scheme.scheme_name},
        )
    stream_response = ask_statesman(query)
    return StreamingResponse(stream_response, media_type="text/plain")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="", port=8000, debug=True, log_level="debug")

And here is the very simple file to test this:

import requests

query = "How tall is the Eiffel tower?"
url = "http://localhost:8000"
params = {"auth_key": "123", "query": query}

response = requests.post(url, params=params, stream=True)

for chunk in response.iter_lines():
    if chunk:
        print(chunk.decode("utf-8"))
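One thing I'm double-checking on the reading side (a sketch of my understanding, not code from the app): `requests`' `iter_lines()` only yields once it has seen a newline, and the token deltas yielded above contain none, so the client could look non-streaming even if the server streams fine. A pure-Python simulation of the two reading styles, with hypothetical chunk data:

```python
# Simulated streamed chunks, as a server might send them (hypothetical data).
def token_stream():
    for chunk in [b"The ", b"Eiffel ", b"Tower ", b"is ", b"330m ", b"tall.\n"]:
        yield chunk

def iter_lines_like(chunks):
    """Buffer chunks and yield only complete newline-terminated lines,
    mirroring how requests' iter_lines behaves."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            yield line
    if pending:
        yield pending

def iter_content_like(chunks):
    """Yield every chunk as soon as it arrives, like iter_content."""
    yield from chunks

# Line-buffered reading collapses the six chunks into one late line,
# while chunk-by-chunk reading sees all six as they arrive.
lines = list(iter_lines_like(token_stream()))
parts = list(iter_content_like(token_stream()))
```

So switching the test client to `response.iter_content(chunk_size=None)` might be worth trying, but I'd still like to understand where the buffering happens.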
