PyTorch 1.13 DataLoader is significantly faster than PyTorch 2.0.1

112 views Asked by Milad Sikaroudi At 03 November 2023 at 15:30

I've noticed that the PyTorch 2.0.1 DataLoader is significantly slower than the PyTorch 1.13 DataLoader, especially when num_workers is set to something other than 0. From what I've read, this may be due to a change in the way PyTorch handles multiprocessing in version 2.0.1: in PyTorch 1.13 the DataLoader uses a separate process for each worker, whereas in PyTorch 2.0.1 it reportedly uses a thread pool to manage the workers. I have not been able to confirm this; the torch.utils.data documentation describes worker subprocesses for num_workers > 0 in both versions, so the cause may lie elsewhere.

I'm using a simple DataLoader, but I need to stick to PyTorch 2.0.1 for other reasons. I'm looking for a workaround to speed up my DataLoader.

Steps to reproduce:

1. Load a dataset using the PyTorch 1.13 DataLoader with the following settings:
   num_workers: 32
   pin_memory: True
2. Time the data loading process.
3. Repeat the same steps with PyTorch 2.0.1 and compare the timings.

Expected behavior:

The data loading process should be faster with PyTorch 2.0.1 DataLoader.

Actual behavior:

The data loading process is significantly slower with PyTorch 2.0.1 DataLoader.
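
For reference, the reproduction steps above can be sketched roughly as follows. This is a minimal sketch, not the exact script behind the numbers in this question: the synthetic TensorDataset, the batch size of 64, the warm-up of 5 batches, and the helper names time_batches and benchmark_dataloader are all assumptions made for illustration. Only num_workers=32 and pin_memory=True come from the settings listed above.

```python
import time
from itertools import islice


def time_batches(loader, warmup=5, max_batches=100):
    """Return (batches_timed, seconds) for iterating over `loader`.

    The first `warmup` batches are consumed before the clock starts, so
    worker start-up is excluded; pass warmup=0 to include it.
    """
    it = iter(loader)
    for _ in islice(it, warmup):
        pass
    start = time.perf_counter()
    timed = sum(1 for _ in islice(it, max_batches))
    return timed, time.perf_counter() - start


def benchmark_dataloader(num_workers=32, pin_memory=True):
    # PyTorch is imported lazily so time_batches works on any iterable.
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Hypothetical synthetic dataset and batch size; substitute the real ones.
    dataset = TensorDataset(
        torch.randn(8192, 3, 32, 32), torch.randint(0, 10, (8192,))
    )
    loader = DataLoader(
        dataset, batch_size=64, num_workers=num_workers, pin_memory=pin_memory
    )
    timed, seconds = time_batches(loader)
    print(f"torch {torch.__version__}: {timed} batches in {seconds:.3f}s")
    return timed, seconds
```

Calling benchmark_dataloader() once in a PyTorch 1.13 environment and once in a 2.0.1 environment, from a script guarded by `if __name__ == "__main__":`, gives two directly comparable timings. Running it with warmup=0 as well shows whether the gap sits in worker start-up or in steady-state loading.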

Environment:

PyTorch version: 1.13, 2.0.1
Python version: 3.9
Operating system: Ubuntu 20.04

Question:

Is there a workaround to speed up the PyTorch 2.0.1 DataLoader?

Additional notes:

I've tried reducing the number of workers and using a smaller batch size, but neither significantly improves performance under PyTorch 2.0.1.

I appreciate any help you can provide.
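
The worker-count comparison mentioned in the notes could be made systematic with a small sweep. The sketch below is illustrative only: sweep_num_workers and make_torch_loader are hypothetical names, and the synthetic dataset and batch size are assumptions rather than the setup used in this question. Unlike a steady-state measurement, each timing here deliberately includes worker start-up.

```python
import time
from itertools import islice


def sweep_num_workers(make_loader, worker_counts, max_batches=50):
    """Time up to `max_batches` batches per worker count.

    `make_loader` is any callable mapping a worker count to an iterable.
    Returns {worker_count: seconds}, start-up cost included.
    """
    results = {}
    for count in worker_counts:
        loader = make_loader(count)
        start = time.perf_counter()
        for _ in islice(loader, max_batches):
            pass
        results[count] = time.perf_counter() - start
    return results


def make_torch_loader(num_workers):
    # Hypothetical factory; PyTorch is imported lazily so the sweep
    # helper above can be used with any iterable.
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.randn(4096, 128))  # assumed synthetic data
    return DataLoader(
        dataset, batch_size=64, num_workers=num_workers, pin_memory=True
    )
```

For example, sweep_num_workers(make_torch_loader, [0, 4, 8, 16, 32]) run under each PyTorch version, from a script guarded by `if __name__ == "__main__":`, would show whether the slowdown grows with the number of workers or is roughly constant.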

There are 0 answers
