I have a program which makes use of the torch.multiprocessing library, and running it often results in OSError: Too many open files. The structure of the program is as follows:
import torch

class Runner:
    def __init__(self, config):
        self.config = config
        torch.multiprocessing.set_sharing_strategy(self.config.SHARING_STRATEGY)
        # Initialisation

    def do_some_processing(self):
        torch.multiprocessing.set_sharing_strategy(self.config.SHARING_STRATEGY)
        # Do some stuff

class MainClass:
    def __init__(self):
        self.config = {}  # filled in from the params passed to the constructor
        torch.multiprocessing.set_sharing_strategy(self.config.SHARING_STRATEGY)
        # Do some stuff

    def run(self):
        torch.multiprocessing.set_sharing_strategy(self.config.SHARING_STRATEGY)
        # Do some stuff
        for i in range(n_runners):
            # Create runner and do something
            pass
        # Collect results and terminate all runners
When the program is run, one creates a MainClass object and passes some params to its constructor. These params are stored in the self.config object. Afterwards, run() is called on the object, which in turn creates a number of Runner objects, runs stuff using them and then gathers the results.
My issue is as follows: I want to be able to pass the desired setting for the torch multiprocessing sharing strategy to the constructor of the main class. It should then be stored in the config, and the strategy should be set according to the parameter that was passed.
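To make the intent concrete, here is a minimal sketch of the behaviour I am after. The Config dataclass and the apply_sharing_strategy helper are hypothetical names used only for illustration; the torch.multiprocessing calls (get_all_sharing_strategies, set_sharing_strategy, get_sharing_strategy) are the real ones. Note that this only shows the setting being applied in the process that calls it; it does not show what any worker processes end up using.

```python
from dataclasses import dataclass

import torch.multiprocessing as mp


@dataclass
class Config:
    # Hypothetical config object; only this field matters for the sketch.
    SHARING_STRATEGY: str = "file_system"


def apply_sharing_strategy(config: Config) -> str:
    """Validate and apply the configured strategy in the current process."""
    supported = mp.get_all_sharing_strategies()
    if config.SHARING_STRATEGY not in supported:
        raise ValueError(
            f"Unsupported sharing strategy {config.SHARING_STRATEGY!r}; "
            f"expected one of {sorted(supported)}"
        )
    mp.set_sharing_strategy(config.SHARING_STRATEGY)
    # Return the active strategy so the caller can confirm it took effect here.
    return mp.get_sharing_strategy()
```

Ideally MainClass.run() would call something like apply_sharing_strategy(self.config) once, before creating any runners.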
I have tried setting config.SHARING_STRATEGY = "file_system" and calling set_sharing_strategy as shown above (in every constructor and also at the beginning of every method). However, this doesn't seem to work, since I still get OSError: Too many open files.
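For context, this error means the process has hit its limit on open file descriptors (RLIMIT_NOFILE). A small diagnostic sketch for seeing how close a process is to that limit is below. It assumes a Unix-like system, since the resource module is not available on Windows, and it only counts open descriptors where /proc/self/fd exists (Linux). The fd_usage name is a hypothetical helper.

```python
import os
import resource


def fd_usage() -> dict:
    """Report the open-file limits and, where possible, the current fd count."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Linux-only: each entry in this directory is one open descriptor.
        open_fds = len(os.listdir("/proc/self/fd"))
    except FileNotFoundError:
        open_fds = None  # /proc is not available on this platform
    return {"soft_limit": soft, "hard_limit": hard, "open_fds": open_fds}
```

Printing fd_usage() before and after creating the runners would show whether the descriptor count actually grows towards the soft limit.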
I was able to resolve this by setting the sharing strategy at the very top of the main .py file, like this:
import torch
torch.multiprocessing.set_sharing_strategy("file_system")
With these two lines at the very top of the file, the error goes away. However, if I do it this way, I can't set the strategy according to the config.SHARING_STRATEGY parameter.
Does anyone have an intuition of why this is or how I would be able to set the sharing strategy later in the program (i.e. in the run() function)?