I have code that creates a zip archive of files and streams it. The problem is that for large requests there can be minutes of processing time before any data is streamed, which makes cancelling requests problematic because the Python code continues to run. Ideally I would yield each compressed file one at a time through a generator function, with the user still receiving the whole zip archive, which would make cancellation of requests more robust.
If the function yields a single file (N=1), it works fine. If it yields two or more (N>=2), the zip is corrupt. Here is a minimal working example:

    # test endpoint
    import io
    import zipfile

    from fastapi import APIRouter, Path
    from fastapi.responses import StreamingResponse
    from numpy import random

    router = APIRouter(tags=["Make files"])


    # make some fake data and zip, then yield
    def make_data(N):
        """
        Make fake data.
        """
        CHUNK_SIZE = 1024 * 1024
        for n in range(N):
            content = random.random(100)
            name = f'{n:02}.txt'
            # Create new in-memory zip file for each file
            s = io.BytesIO()
            with zipfile.ZipFile(s, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=2) as zf:
                # Add file content to the in-memory zip file
                zf.writestr(name, content)
            # Seek to the beginning of the in-memory zip file
            s.seek(0)
            # Yield the content of the in-memory zip file for the current file
            while chunk := s.read(CHUNK_SIZE):
                yield chunk


    # streamingresponse
    def stream_data(N):
        """
        Stream the files.
        """
        return StreamingResponse(
            make_data(N), media_type="application/zip",
            headers={"Content-Disposition": "attachment; filename=download.zip"})


    # endpoint
    @router.get("/{N}")
    async def yield_files(
            N: int = Path(..., description="random files to make")):
        return stream_data(N)

Is there a simple change that could be made to allow this script to work?
Zip files can be streamed, but Python's ZipFile does not appear to support streaming. You could perhaps link to or adapt the C code in zipflow to stream zip files.
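
For context, the example in the question breaks for N>=2 because every loop iteration emits a complete archive with its own central directory, and several archives concatenated back to back do not form one valid archive. A pure-Python workaround may still be possible: the `zipfile` documentation notes that writing to unseekable streams has been supported since Python 3.5, so a single `ZipFile` can be kept open over a write-only buffer that is drained after each member. The following is only a sketch under that assumption, not a tested production solution. `_ChunkBuffer`, `stream_zip`, and the `(name, data)` input shape are hypothetical names introduced here for illustration.

```python
import io
import zipfile


class _ChunkBuffer(io.RawIOBase):
    """Write-only, non-seekable sink that collects what ZipFile writes."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    def writable(self):
        return True

    def seekable(self):
        return False

    def write(self, b):
        data = bytes(b)
        self._chunks.append(data)
        return len(data)

    def drain(self):
        """Return everything written since the last call and clear the buffer."""
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_zip(files):
    """Yield the bytes of ONE zip archive, piece by piece.

    `files` is an iterable of (name, data) pairs, where data is bytes or str.
    """
    sink = _ChunkBuffer()
    with zipfile.ZipFile(sink, "w", compression=zipfile.ZIP_DEFLATED,
                         compresslevel=2) as zf:
        for name, data in files:
            zf.writestr(name, data)
            # Everything written for this member (header and compressed data)
            yield sink.drain()
    # Closing the archive writes the central directory, which must come last
    yield sink.drain()
```

In the endpoint from the question, `make_data(N)` could then be replaced by something like `stream_zip((f"{n:02}.txt", content_bytes) for n in range(N))`, passed to `StreamingResponse` as before. Because the sink is not seekable, sizes and checksums are written after each member's data rather than in its header, which most unzip tools accept but is worth checking against the clients you need to support.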