HTTP request in a browser "redirects" without a 302, but with the Python Requests library it returns 404

I'm trying to fetch a page built with React using Python's requests.get, but it returns 404.

        import requests
        page = requests.get("https://example.com/foo", allow_redirects=True)
        print(page.status_code)

This results in 404. I see that requests supports HTTP/1.1 only.

With curl the URL also returns 404, but the server then responds with a different page anyway. The server is using HTTP/2. Here are some hints from curl -v that seem relevant:

$ curl -v https://example.com/foo
* Trying 10.0.0.1
* TCP_NODELAY set
* Connected to example.com (10.0.0.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
[snip]
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
> GET /foo HTTP/2
> Host: example.com
> User-Agent: curl/7.61.1
> Accept: */*
[snip]
< HTTP/2 404 
< date: Fri, 08 Apr 2022 08:42:34 GMT
< content-type: text/html; charset=utf-8
< cache-control: public, max-age=0, s-maxage=300
< etag: W/"a568501bae2318d9d0ca13a89359638e"
< last-modified: Fri, 10 Sep 2021 17:30:40 UTC
< strict-transport-security: max-age=315360000; includeSubdomains; preload
< vary: Accept-Encoding
< x-content-type-options: nosniff
< cf-cache-status: MISS
[snip some cloudflare stuff]

This is finally followed by the content of https://example.com/bar.

The headers of the response seem to indicate it "offers" HTTP/1.1, so how do I ask for it with the requests library?

Searching, I found httpx, an HTTP/2-capable library, but its examples use async snippets and assume background knowledge of async. Do I have to use httpx for HTTP/2 on Python 3.8 or greater? Is there a way to do this without async/await?

There is 1 answer

Bastien B

You can use httpx for HTTP/2. A specific section of the docs explains how to enable it. You need at least Python 3.6, and the docs' examples suggest using async.

pip install httpx[http2]

From the docs:

client = httpx.AsyncClient(http2=True)

But you can also use it with the regular synchronous Client, without async/await:

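import httpx

# The synchronous Client accepts http2=True as well, so no async/await is needed.
client = httpx.Client(http2=True)

if __name__ == "__main__":
    resp = client.get('https://example.com/foo')
    print(resp.http_version)  # the protocol version actually used for this response
    print(resp.content)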
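
Separately from the HTTP/2 question: the curl trace in the question shows a 404 status followed by page content, which suggests the content is simply the body of the 404 response. A status code and a body are independent, so a client can read the body even when the status is 404. The sketch below illustrates this with the standard library only, so it runs without network access. The local stand-in server, `FallbackHandler`, `FALLBACK_BODY`, and `fetch_status_and_body` are all hypothetical names made up for the illustration; they are not the real site's behavior.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content standing in for whatever the real server sends.
FALLBACK_BODY = b"<html><body>fallback page</body></html>"


class FallbackHandler(BaseHTTPRequestHandler):
    """Stand-in server: answers every GET with status 404 plus an HTML body."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(FALLBACK_BODY)))
        self.end_headers()
        self.wfile.write(FALLBACK_BODY)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_status_and_body(path="/foo"):
    """Start the stand-in server, request `path`, and return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), FallbackHandler)  # port 0 = any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        resp = conn.getresponse()
        status, body = resp.status, resp.read()  # the body is readable despite the 404
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_status_and_body()
print(status)  # 404
print(body.decode("utf-8"))
```

With requests the same idea applies: `page.status_code` can be 404 while `page.text` still holds whatever body the server sent, because requests only raises on error statuses if you call `page.raise_for_status()`. Whether the real server sends the same body to every client is something you would need to check against the actual site.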