I am downloading many large files using aiohttp; by "many" I mean hundreds of files, many of them hundreds of mebibytes or even gibibytes in size.
I am using aiohttp with PyQt6. Sometimes the connections go stale for whatever reason and execution just freezes: the download speed drops to literally zero, no exception is thrown, no timeout fires, and the program hangs with no progress ever being made again.
The progress bar just freezes there.
I don't really know how else to describe this, and Google searching proves futile. The problem is very simple: the program is waiting for data that will never arrive because the connections have hung. In Task Manager, disk usage is 0, network usage is 0, and CPU utilization is 0.
Minimal example of my code:
import aiofiles
import asyncio
from aiohttp import ClientSession, ClientTimeout
from PyQt6.QtCore import QThread, pyqtSignal
from PyQt6.QtWidgets import QApplication
from qasync import QEventLoop
from tqdm import tqdm

CHUNK = 524288


class Downloader(QThread):
    update = pyqtSignal()

    def __init__(self, url: str, filepath: str, links: int = 32) -> None:
        super().__init__()
        self.url = url
        self.filepath = filepath
        self.links = links

    async def preprocess(self, session: ClientSession) -> None:
        resp = await session.head(self.url)
        self.total = int(resp.headers["Content-Length"])
        self.progress = tqdm(
            total=self.total, unit_scale=True, unit_divisor=1024, unit="B"
        )

    async def start_download(self) -> None:
        async with ClientSession(
            headers={
                "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:123.0) Gecko/20100101 Firefox/123.0"
            },
            timeout=ClientTimeout(sock_read=5),
        ) as session:
            await self.preprocess(session)
            self.chunk = self.total // self.links
            ends = range(0, self.total, self.chunk)[: self.links]
            self.ranges = [
                *((start, end - 1) for start, end in zip(ends, ends[1:])),
                (ends[-1], self.total - 1),
            ]
            await asyncio.gather(
                *(self.download_worker(i, session) for i in range(self.links)),
                return_exceptions=True,
            )

    async def download_worker(self, index: int, session: ClientSession) -> None:
        async with aiofiles.open(self.filepath, "wb") as file:
            start, end = self.ranges[index]
            await file.seek(start)
            async with session.get(
                url=self.url,
                headers={"Range": f"bytes={start}-{end}"},
            ) as resp:
                async for chunk in resp.content.iter_chunked(CHUNK):
                    self.progress.update(len(chunk))
                    self.update.emit()
                    await file.write(chunk)

    def run(self) -> None:
        loop = QEventLoop(self)
        asyncio.set_event_loop(loop)
        loop.run_until_complete(self.start_download())


app = QApplication([])
down = Downloader("http://ipv4.download.thinkbroadband.com/100MB.zip", "D:/speedtest/100MB.zip")
down.run()
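As an aside, the range-splitting logic inside start_download can be checked in isolation. This is just a standalone sketch of the same computation (byte_ranges is a name I am using here, not part of the class):

```python
def byte_ranges(total: int, links: int):
    """Split [0, total) into `links` contiguous (start, end) byte ranges,
    inclusive on both ends, mirroring the Range headers built in
    start_download."""
    chunk = total // links
    ends = range(0, total, chunk)[:links]
    return [
        # each range ends one byte before the next one starts
        *((s, e - 1) for s, e in zip(ends, ends[1:])),
        # the last range absorbs any remainder up to the final byte
        (ends[-1], total - 1),
    ]

print(byte_ranges(100, 4))  # [(0, 24), (25, 49), (50, 74), (75, 99)]
```

The ranges cover every byte exactly once, so each worker can seek to its own offset and write independently into the same file.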
The above code usually runs successfully, and so far the address given has never caused the freezing issue. I am downloading NexusMods mod files, and sometimes NexusMods links trigger the freeze, but NexusMods download links expire, so I cannot use them as examples.
Running the code normally produces no errors and the download succeeds. However, I have found a way to reproduce the freeze 100% of the time: while the download is running, cut your network connection by whatever means is convenient. The download stops, the progress bar just freezes there, and no exception is thrown.
I use devcon disable *dev_8168* to cut the network while it is downloading, and the freeze reproduces every single time; progress does not resume after I run devcon enable *dev_8168* to re-enable the adapter.
Now if I change the corresponding code to this:
class Downloader:
    def __init__(self, url: str, filepath: str, links: int = 32) -> None:
        self.url = url
        self.filepath = filepath
        self.links = links

    def run(self) -> None:
        asyncio.run(self.start_download())
It works normally, but as soon as I cut the network, exceptions are thrown, like so:
-------------------------------------------------------------------------- | 35.4M/100M [00:04<00:07, 8.93MB/s]
ContentLengthError Traceback (most recent call last)
File C:\Python310\lib\site-packages\aiohttp\client_proto.py:83, in ResponseHandler.connection_lost(self, exc)
82 try:
---> 83 uncompleted = self._parser.feed_eof()
84 except Exception as e:
File C:\Python310\lib\site-packages\aiohttp\_http_parser.pyx:510, in aiohttp._http_parser.HttpParser.feed_eof()
ContentLengthError: 400, message:
Not enough data for satisfy content length header.
The above exception was the direct cause of the following exception:
ClientPayloadError Traceback (most recent call last)
...
ClientPayloadError: Response payload is not completed
This is the intended behavior: it stops execution immediately instead of hanging forever, and the exceptions can be caught. What I want is this: if no data is received for 3 seconds, close the connection and response immediately, stop waiting, and raise a TimeoutError, so that the TimeoutError can be caught and the download resumed over fresh connections.
But under QThread and QEventLoop it just freezes and waits forever, and no exception is thrown. How can I make it raise exceptions while running in a QThread with qasync's QEventLoop?
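For what it's worth, the stall-detection behavior I am after can be demonstrated with plain asyncio by wrapping each read in asyncio.wait_for, which raises TimeoutError when no chunk arrives in time. This is only a sketch of the idea with made-up names (iter_with_stall_timeout, stalling_source); I have not verified that it behaves any differently under qasync's QEventLoop:

```python
import asyncio


async def iter_with_stall_timeout(source, timeout: float):
    """Yield chunks from an async iterator, raising TimeoutError if no
    chunk arrives within `timeout` seconds (i.e. the connection stalled)."""
    it = source.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(it.__anext__(), timeout)
        except StopAsyncIteration:
            return
        yield chunk


async def stalling_source():
    """Simulates a connection that delivers one chunk and then goes dead."""
    yield b"data"
    await asyncio.sleep(10)  # stands in for a hung socket
    yield b"never delivered"


async def main():
    received = []
    try:
        async for chunk in iter_with_stall_timeout(stalling_source(), 0.2):
            received.append(chunk)
    except asyncio.TimeoutError:
        return received, True  # stall detected, caller could reconnect here
    return received, False


received, timed_out = asyncio.run(main())
print(received, timed_out)  # [b'data'] True
```

In the real downloader the source would be resp.content.iter_chunked(CHUNK), and catching the TimeoutError would be the point at which I close the response and open a new connection for the remaining range, but under QThread and QEventLoop I never get that far.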