I'm trying to download PDF files that are rendered in the browser (not shown as a popup or downloaded) using Playwright (Python). No URL is exposed, so you can't simply scrape a link and download it with requests.get("file_url").
I've tried:
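    import asyncio
    from playwright.async_api import async_playwright

    async def main():
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=False)
            # The Python API is snake_case: new_page / accept_downloads
            page = await browser.new_page(accept_downloads=True)
            # goto() needs a full URL including the scheme
            await page.goto("https://www.some_landing_page.com")
            async with page.expect_download() as download_info:
                await page.click("a")  # selector to a pdf file
            # In the async API both the download and its path must be awaited
            download = await download_info.value
            path = await download.path()

    asyncio.run(main())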
I've also tried page.expect_popup(), with no luck either. My understanding is that this can't be done using pyppeteer, but I would welcome a solution that way as well, if possible.
For anyone with a similar problem: try using Firefox or WebKit instead of Chromium as the browser. That gave me a workaround.
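A minimal sketch of that workaround, assuming that with Firefox (or WebKit) the click on the PDF link fires a download event instead of opening the built-in viewer. The names fetch_pdf and target_path, the "downloads" directory, and the default selector usage are hypothetical and only here for illustration; the URL is the same placeholder as in the question.

```python
import asyncio
from pathlib import Path


def target_path(download_dir: str, suggested_filename: str) -> Path:
    """Build the save path; fall back to a generic name if none is suggested."""
    name = Path(suggested_filename).name or "download.pdf"
    return Path(download_dir) / name


async def fetch_pdf(landing_url: str, selector: str,
                    download_dir: str = "downloads",
                    engine: str = "firefox") -> Path:
    # Imported lazily so target_path() can be used without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # engine is "firefox", "webkit" or "chromium" (p.firefox, p.webkit, ...)
        browser_type = getattr(p, engine)
        browser = await browser_type.launch()
        page = await browser.new_page(accept_downloads=True)
        await page.goto(landing_url)

        # Assumption: in this engine the click triggers a download event.
        async with page.expect_download() as download_info:
            await page.click(selector)
        download = await download_info.value

        # Copy the file out of Playwright's temporary location.
        destination = target_path(download_dir, download.suggested_filename)
        await download.save_as(destination)
        await browser.close()
        return destination


# Example call (placeholder URL and selector, replace with the real ones):
# asyncio.run(fetch_pdf("https://www.some_landing_page.com", "a"))
```

Passing engine="webkit" tries the other engine mentioned above. Saving with download.save_as() rather than relying on download.path() keeps the file after the browser is closed, since the temporary download is cleaned up with the browser context.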