I am using Pyppeteer to fetch web content. It runs fine with FastAPI on my local machine, but inside a Docker container it hangs at the newPage call. Below is my code.
main.py
import uvicorn
from fastapi import FastAPI
from pyppeteer import launch

app = FastAPI()


async def test():
    browser = await launch(
        args=[
            "--no-sandbox",
            "--disable-gpu",
            "--disable-dev-shm-usage",
            "--disable-setuid-sandbox",
        ]
    )
    page = await browser.newPage()
    await page.goto("https://example.com")
    await page.waitForSelector(".invoice-box")
    bodyHandle = await page.querySelector("body")
    body = await bodyHandle.boundingBox()
    pageWidth = round(body["width"])
    pageHeight = round(body["height"])
    await page.setViewport({"width": pageWidth, "height": pageHeight})
    await page.screenshot({"path": "example.png", "fullPage": True})
    await browser.close()


@app.get("/test")
async def main():
    await test()
    return {"message": "ok"}
Due to project requirements, I chose this image.
Dockerfile
FROM nikolaik/python-nodejs:python3.11-nodejs18-slim
WORKDIR /code
COPY entrypoint.sh requirements.txt /code/
RUN apt-get update \
    && apt-get install -y libpango-1.0-0 libpangoft2-1.0-0 \
    && pip install --no-cache-dir --upgrade -r requirements.txt \
    && apt-get update \
    && apt-get install -yq gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 \
        libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 \
        libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 \
        libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 \
        ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget \
        xvfb x11vnc x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable x11-apps \
    && chmod +x entrypoint.sh
COPY app /code/app
ENTRYPOINT ["./entrypoint.sh"]
entrypoint.sh
#!/bin/bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
The first time you launch Pyppeteer, it tries to download a suitable Chromium build if none is found in the container (as mentioned in the usage section of its docs).
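To make that first-launch behaviour concrete, here is a rough sketch of the check-then-download step. It assumes pyppeteer's chromium_downloader module exposes check_chromium, download_chromium and chromium_executable, which is worth verifying against your installed version. The ensure_chromium name and its downloader parameter are hypothetical, added only so the logic can be exercised without a real download.

```python
import importlib


def ensure_chromium(downloader=None):
    """Download Chromium once if it is missing and return the executable path.

    `downloader` is a hypothetical injection point; by default the function
    uses pyppeteer's own chromium_downloader module (assumed API).
    """
    if downloader is None:
        downloader = importlib.import_module("pyppeteer.chromium_downloader")
    # Only download when no suitable Chromium build is already present.
    if not downloader.check_chromium():
        downloader.download_chromium()
    return str(downloader.chromium_executable())
```

Running something like this at image build time, rather than on the first request, would move the download out of the request path.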
If you want to avoid this, add this command after all apt-get commands:
However, I would strongly recommend running Pyppeteer inside Celery or another process, not inside the web API.
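Keeping the browser work out of the web API process could look roughly like the sketch below, which uses only the standard library. render_job and render_in_worker are hypothetical names, and render_job is a placeholder for the real Pyppeteer work; a Celery task would be another way to get the same separation.

```python
import asyncio
import concurrent.futures


def render_job(url: str) -> str:
    # Hypothetical stand-in for the browser work. A real worker would run
    # the Pyppeteer coroutine here, for example through asyncio.run(...).
    return f"rendered {url}"


async def render_in_worker(
    pool: concurrent.futures.Executor, url: str, timeout: float = 60.0
) -> str:
    loop = asyncio.get_running_loop()
    # Hand the job to the executor so a stuck browser does not block
    # the API's event loop.
    future = loop.run_in_executor(pool, render_job, url)
    # The timeout only stops waiting for the result; it does not kill
    # the worker that is running the job.
    return await asyncio.wait_for(future, timeout=timeout)
```

In the FastAPI handler, this would be called as await render_in_worker(pool, url), with pool being a concurrent.futures.ProcessPoolExecutor created once at startup.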