When I try to scrape this site I run into an issue and I can't figure out what's wrong. I tried using HTMLSession, but Python told me to use AsyncHTMLSession because the former can't be used inside an existing event loop. When using AsyncHTMLSession I keep running into this problem.
url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0000789019/000095017023035122/msft-20230630.htm"
session = AsyncHTMLSession()
response = session.get(url)
await response.html.arender()
await session.close()
print(response.html)
print(response.html.html)
This is the error I get:
AttributeError Traceback (most recent call last)
Cell In [12], line 4
2 session = AsyncHTMLSession()
3 response = session.get(url)
----> 4 await response.html.arender()
5 await session.close()
7 print(response.html)
AttributeError: '_asyncio.Future' object has no attribute 'html'
Any help would be greatly appreciated.
I've added await to the render call, tried passing a sleep value to it, and also added await asession.close(); each attempt produced the same error.
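The traceback points at the immediate cause: the result of `session.get(url)` is used without being awaited, so `response` is still a pending future and has no `html` attribute yet. A minimal sketch of that behaviour using only `asyncio` follows; `FakeResponse`, `blocking_fetch` and `fake_get` are hypothetical stand-ins for illustration, not part of requests-html.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for a response object that exposes `html`."""
    html = "<html>ok</html>"


def blocking_fetch():
    # Pretend this is a blocking HTTP request.
    return FakeResponse()


def fake_get(loop):
    # Hypothetical helper: runs the blocking call in a worker thread and
    # returns a Future immediately, without waiting for the result.
    return loop.run_in_executor(None, blocking_fetch)


async def main():
    loop = asyncio.get_running_loop()

    pending = fake_get(loop)  # not awaited: this is a Future, not a response
    try:
        pending.html
        error = None
    except AttributeError as exc:
        error = str(exc)

    response = await pending  # awaiting yields the actual response object
    return error, response.html


error, html = asyncio.run(main())
print(error)
print(html)
```

In the code from the question, the corresponding change would be `response = await session.get(url)`. That should remove the AttributeError, although it may still not return the filing text if the page builds its content with JavaScript.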
Use a different URL to load the HTML (not the Ajax-y one), for example:
Prints:
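One way to get a non-Ajax URL is to derive it from the viewer link. This is a sketch under an assumption: that the `doc` query parameter of the `ix?doc=` URL is the path of the underlying document on the same host. The helper names `direct_document_url` and `fetch_html` are made up for this example, and the fetch step uses only the standard library rather than requests-html.

```python
from urllib.parse import parse_qs, urljoin, urlsplit
from urllib.request import Request, urlopen


def direct_document_url(viewer_url):
    """Build the plain document URL from an `ix?doc=` viewer URL.

    Assumes the `doc` query parameter is a path on the same host.
    """
    parts = urlsplit(viewer_url)
    doc_path = parse_qs(parts.query)["doc"][0]
    return urljoin(f"{parts.scheme}://{parts.netloc}", doc_path)


def fetch_html(url, user_agent):
    """Download the page as text (not called here; needs network access).

    Assumption: the server may reject requests without a descriptive
    User-Agent, so the caller has to supply one.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as resp:
        return resp.read().decode("utf-8", errors="replace")


viewer_url = (
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0000789019/"
    "000095017023035122/msft-20230630.htm"
)
plain_url = direct_document_url(viewer_url)
print(plain_url)
```

Calling `fetch_html(plain_url, "<your name and contact>")` would then download the static HTML, which can be parsed without rendering any JavaScript.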