How to solve the batch_size set to None problem with PyTorch Lightning and a DGL DataLoader


I am currently working on a model where I need to pass a specific batch size to my DGL DataLoader, but when I use it with PyTorch Lightning I get the following warning:

Trying to infer the batch_size from an ambiguous collection. The batch size we found is 2998. To avoid any miscalculations, use self.log(..., batch_size=batch_size).
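For context, here is a minimal sketch of the kind of setup I mean (the graph, its 'feat'/'label' node data, the node IDs, and the model are placeholders rather than my real code, and it assumes a recent DGL where the loader is dgl.dataloading.DataLoader). Passing batch_size explicitly to self.log, as the warning suggests, should stop the inference, but I would prefer the DataLoader itself to carry the batch size:

import dgl
import dgl.dataloading
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitNodeClassifier(pl.LightningModule):
    """Placeholder module: assumes the graph has 'feat' and 'label' node data."""

    def __init__(self, graph, train_nids, batch_size=1024, num_classes=2):
        super().__init__()
        self.graph = graph
        self.train_nids = train_nids
        self.batch_size = batch_size
        self.linear = torch.nn.Linear(graph.ndata["feat"].shape[1], num_classes)

    def train_dataloader(self):
        sampler = dgl.dataloading.MultiLayerFullNeighborSampler(1)
        # The explicit batch size is passed to the DGL DataLoader here...
        return dgl.dataloading.DataLoader(
            self.graph, self.train_nids, sampler,
            batch_size=self.batch_size, shuffle=True,
        )

    def training_step(self, batch, batch_idx):
        # ...but Lightning still cannot read it back from the loader.
        input_nodes, output_nodes, blocks = batch
        x = self.graph.ndata["feat"][output_nodes]
        y = self.graph.ndata["label"][output_nodes]
        loss = F.cross_entropy(self.linear(x), y)
        # Workaround from the warning text: state the batch size explicitly.
        self.log("train_loss", loss, batch_size=len(output_nodes))
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)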

Looking at the DGL documentation, I found the comment below, but I still cannot figure out what to do to make the DataLoader not return None for batch_size.

(BarclayII) PyTorch Lightning sometimes will recreate a DataLoader from an existing DataLoader with modifications to the original arguments. The arguments are retrieved from the attributes with the same name, and because we change certain arguments when calling super().__init__() (e.g. the batch_size attribute is None even if the batch_size argument is not, so the next DataLoader's batch_size argument will be None), we cannot reinitialize the DataLoader with attributes from the previous DataLoader directly.

A workaround is to check whether "collate_fn" appears in kwargs. If "collate_fn" is indeed in kwargs and it's already a CollateWrapper object, we can assume that the arguments come from a previously created DGL DataLoader, and directly initialize the new DataLoader from kwargs without any changes.
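If I understand that comment correctly, the check it describes would look roughly like the sketch below (my own illustration, not the actual DGL source; the CollateWrapper name comes from the quote above):

from torch.utils.data import DataLoader


def make_dataloader(dataset, **kwargs):
    """Rough illustration of the workaround described in the quoted comment."""
    collate_fn = kwargs.get("collate_fn")
    # "CollateWrapper" is the class name mentioned in the quote; checking by
    # name here only so the sketch does not depend on DGL internals.
    if collate_fn is not None and type(collate_fn).__name__ == "CollateWrapper":
        # The arguments come from a previously created DGL DataLoader (e.g.
        # PyTorch Lightning recreating it), so initialize the new DataLoader
        # directly from kwargs without any changes.
        return DataLoader(dataset, **kwargs)
    raise NotImplementedError(
        "DGL's normal argument processing (wrapping collate_fn, changing "
        "batch_size, etc.) would go here; it is omitted in this sketch."
    )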


There are 0 answers