I'm trying to scrape a webpage after login.
If I use only BeautifulSoup and requests, I get:
Please enable JavaScript to continue using this application.
So I decided to use requests_html with the following code:
from requests_html import HTMLSession

session = HTMLSession()
session.get(url)
resp = session.post(loginUrl, data={"email": "[email protected]", "password": "Pass123"})
resp.html.render()
But I get either the same error or:
pyppeteer.errors.PageError: net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH
So I decided to use selenium, even though I'd really prefer requests because the script runs faster.
When I use selenium it works fine, but when I load selenium's page source into BeautifulSoup, I again get the
Please enable JavaScript to continue using this application.
error page.
Why? The page loads fine in the driver, and I'm just parsing the HTML page source that selenium returns.
How can I fix both the requests_html and BeautifulSoup errors?
You don't really need either pyppeteer or selenium. You can log in with a plain request and get all the data you want. The key here is to get the accessToken via the Login endpoint and then use it in subsequent requests. The API calls I'm making here are the meat of the page after logging in; the rest of the HTML is just eye-candy. The data coming from the API corresponds to what you see on the site.
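A minimal sketch of that pattern with plain requests (the login URL and the response field names below are assumptions; the real ones come from the site's network traffic):

import requests

# NOTE: the login URL and the JSON field names below are assumptions for
# illustration; check the site's network traffic for the real Login endpoint.
loginUrl = "https://example.com/api/Login"

session = requests.Session()

# Log in with a plain POST; no browser or JavaScript rendering involved.
login_response = session.post(
    loginUrl,
    json={"email": "[email protected]", "password": "Pass123"},
)
login_response.raise_for_status()

# Assuming the Login endpoint returns JSON with an accessToken field.
access_token = login_response.json()["accessToken"]

# Attach the token to every subsequent request the session makes.
session.headers["Authorization"] = f"Bearer {access_token}"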
As for the pyppeteer.errors.PageError: net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH, this error is typically caused by an SSL/TLS handshake failure. The server you're trying to connect to may be using an outdated or unsupported SSL/TLS version or cipher suite. You can read more about the error here.
TL;DR: There's not much you can do about it.
I'd recommend using my approach (no browser, just API calls).
Benefits of the following approach:
- No browser to install or drive, so no pyppeteer or selenium errors to debug.
- Much faster than rendering pages, which is what you wanted from requests in the first place.
- The API returns structured data directly, so there's no HTML to parse with BeautifulSoup.
Here's how you can get the sale data:
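(The snippet below is a sketch, not the site's real API: the sales endpoint path and the shape of the login response are assumptions you'd confirm in the browser's network tab.)

import requests

# NOTE: every URL and field name below is an assumption for illustration;
# the real Login and sales endpoints come from the site's network traffic.
loginUrl = "https://example.com/api/Login"
salesUrl = "https://example.com/api/sales"

session = requests.Session()

# Step 1: log in and pull the accessToken out of the Login response.
login_response = session.post(
    loginUrl,
    json={"email": "[email protected]", "password": "Pass123"},
)
login_response.raise_for_status()
access_token = login_response.json()["accessToken"]

# Step 2: call the data endpoint with the token instead of scraping rendered HTML.
sales_response = session.get(
    salesUrl,
    headers={"Authorization": f"Bearer {access_token}"},
)
sales_response.raise_for_status()

# Step 3: the payload is already structured JSON, so there's no HTML to parse.
for sale in sales_response.json():
    print(sale)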
If you plug in your registration email and a valid password, you should see this:
There's plenty more in the sales_data table, like location, phone numbers, etc. Here's a sample: