I am trying to capture all console messages for a given website URL using pyppeteer. However, all I get are responses with a 200 status code. The website I am testing also produces 4xx and 5xx responses, but the script does not report them. What I ultimately want is a Python script that extracts all JavaScript and AJAX errors from a website. Is this possible with pyppeteer, or with some other library?
This is the code that I am using.
import asyncio
from pyppeteer import launch


async def on_console(msg):
    print('PAGE LOG:', msg.text)


async def on_page_error(error):
    print(error)


async def on_response(response):
    print(f"{response.status} {response.url}")


async def on_request_failed(request):
    failure = request.failure()
    if failure:
        print(failure.get('errorText', 'Unknown error'), request.url)
    else:
        print('Request failed without error information', request.url)


async def main():
    browser = await launch()
    page = await browser.newPage()
    page.on('console', lambda msg: asyncio.ensure_future(on_console(msg)))
    page.on('pageerror', lambda error: asyncio.ensure_future(on_page_error(error)))
    page.on('response', lambda response: asyncio.ensure_future(on_response(response)))
    page.on('requestfailed', lambda request: asyncio.ensure_future(on_request_failed(request)))
    try:
        await page.goto('https://toursthingstodo.com')
        await page.waitFor(5000)
    except Exception as e:
        print(f"An error occurred: {e}")
    await browser.close()


asyncio.run(main())
This is the output that I get:
200 https://toursthingstodo.com/
200 https://toursthingstodo.com/_next/static/css/cfeb154ff119552e.css
200 https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js?client=ca-pub-3751418851639209
However, these are the errors that I have on my website: [image: screenshot of the errors]
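To make the goal concrete, here is a rough sketch of the filtering I have in mind: keep only console errors and warnings, uncaught page errors, HTTP responses with a status of 400 or above, and failed requests. The `ErrorCollector` class and its method names are hypothetical (my own, not part of pyppeteer). The demo at the bottom uses `SimpleNamespace` stand-ins with made-up `example.com` URLs instead of real browser events, so it runs without launching Chromium. It assumes the event objects expose `msg.type`, `msg.text`, `response.status`, `response.url`, and `request.failure()`, as the pyppeteer objects in my script do.

```python
from types import SimpleNamespace


class ErrorCollector:
    """Hypothetical helper that records only error-like page events."""

    def __init__(self):
        self.errors = []

    def on_console(self, msg):
        # Keep only console messages reported as errors or warnings.
        if msg.type in ('error', 'warning'):
            self.errors.append(('console', msg.type, msg.text))

    def on_page_error(self, error):
        # Uncaught JavaScript exceptions raised inside the page.
        self.errors.append(('pageerror', str(error)))

    def on_response(self, response):
        # Treat any 4xx or 5xx status as an error; ignore everything else.
        if response.status >= 400:
            self.errors.append(('http', response.status, response.url))

    def on_request_failed(self, request):
        # failure() may return None, so fall back to a generic message.
        failure = request.failure()
        text = failure.get('errorText', 'Unknown error') if failure else 'Unknown error'
        self.errors.append(('requestfailed', text, request.url))

    def attach(self, page):
        # Register the handlers on a page object before navigating.
        page.on('console', self.on_console)
        page.on('pageerror', self.on_page_error)
        page.on('response', self.on_response)
        page.on('requestfailed', self.on_request_failed)


# Offline demo with stand-in objects (not real pyppeteer events).
collector = ErrorCollector()
collector.on_response(SimpleNamespace(status=200, url='https://example.com/'))
collector.on_response(SimpleNamespace(status=404, url='https://example.com/missing.js'))
collector.on_console(SimpleNamespace(type='log', text='hello'))
collector.on_console(SimpleNamespace(type='error', text='Failed to load resource'))
collector.on_request_failed(
    SimpleNamespace(failure=lambda: {'errorText': 'net::ERR_FAILED'},
                    url='https://example.com/api')
)
for entry in collector.errors:
    print(entry)
```

In the real script, the idea would be to call `collector.attach(page)` right after `browser.newPage()` and before `page.goto(...)`, then inspect `collector.errors` once the wait finishes. This only filters events that pyppeteer actually delivers, so it would not by itself explain why my script never sees the 4xx and 5xx responses.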