I want to create multiple folders from a base folder and then run an executable in each of those folders in parallel. It works fine when the folders already exist. However, it gets stuck in an infinite loop when I create the folders and then run the multiprocessing in a single Python script.
Working code (when address_a and address_b already exist):
import os
from multiprocessing import Pool
Base_address='C:\\Users\\bappi\\Desktop\\Base_address'
address_a='C:\\Users\\bappi\\Desktop\\address_a'
address_b='C:\\Users\\bappi\\Desktop\\address_b'
Folders=['']*2
Folders[0]=address_a
Folders[1]=address_b
def call_exe(address):
    os.chdir(address)
    exe_file='Test.exe'
    os.system(exe_file)
if __name__ == '__main__':
    with Pool(2) as p:
        p.map(call_exe,Folders)
However, when address_a and address_b do not exist and I want to create them from the base folder, the following code does not work.
Not working code (when I create address_a and address_b as copies of the base folder):
import os
from multiprocessing import Pool
import shutil
Base_address='C:\\Users\\bappi\\Desktop\\Base_address'
address_a='C:\\Users\\bappi\\Desktop\\address_a'
address_b='C:\\Users\\bappi\\Desktop\\address_b'
shutil.copytree(Base_address, address_a) # folder is created successfully
shutil.copytree(Base_address, address_b) # folder is created successfully
Folders=['']*2
Folders[0]=address_a
Folders[1]=address_b
def call_exe(address):
    os.chdir(address)
    exe_file='Test.exe'
    os.system(exe_file)
if __name__ == '__main__':
    with Pool(2) as p:
        p.map(call_exe,Folders) # This call gets stuck in an infinite loop
If you hold your cursor over the tag multiprocessing you will see:

When questions are posted with the tag multiprocessing it is important to always specify the platform you are executing on as it can make a huge difference between your code working or not.

Since you did not specify the platform I will, perhaps erroneously, assume that it is one that uses the spawn method rather than the fork method to create new processes. Even if that is not the case you may find some value in this answer for the future.
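
If you are unsure which start method your platform uses, a small check like the following can tell you. This is only a sketch; the helper name describe_start_method is made up for illustration, while multiprocessing.get_start_method() and sys.platform are part of the standard library.

```python
import multiprocessing
import sys


def describe_start_method():
    """Return the platform name and the start method multiprocessing will use."""
    return sys.platform, multiprocessing.get_start_method()


if __name__ == '__main__':
    platform, method = describe_start_method()
    print(f'platform={platform}, start method={method}')
```

On Windows the only available start method is spawn, so the rest of this answer applies there.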
When the spawn method is used to create processes and you are using multiprocessing (a multiprocessing pool in this case), the child processes are created and initially have uninitialized memory. Then for each child process the Python interpreter is loaded and executed, reading in the original source program. Therefore, every statement at global scope (import statements, function definitions, etc.) will be executed to initialize memory before your worker function call_exe is called.

So if you are creating a pool with N processes, the statements at global scope will be executed N times, once for each process. Some of these statements may not be required for the proper initialization needed by the worker function, for example an import statement for a module/package not used by the worker function, but executing them may not cause undue harm. Other unnecessary global statements, however, might cause irreparable harm or add inefficiencies because they waste CPU and/or memory resources. Therefore, if you have anything at global scope that you do not want executed, you must enclose such statements within the check if __name__ == '__main__':, which will only evaluate as True for the initial, main process.

You have at global scope the statements (among others):

shutil.copytree(Base_address, address_a)
shutil.copytree(Base_address, address_b)

Each of these statements will be executed once for each process in your multiprocessing pool, for a total of 2 times each since you have a pool of size 2. The initialization of the pool processes will of necessity raise an exception since the directories will already exist (the main process has already created them). If you are running Python 3.8 or greater you can specify the dirs_exist_ok=True argument to allow for the directories already existing. But even then, while the first pool process is processing its directory in worker function call_exe, the second pool process might be modifying that directory with its call to copytree. These are statements at global scope causing irreparable harm.

In the code below I have moved all statements that are not required for call_exe to work into the if __name__ == '__main__': block (some of these, such as the import statements or the definitions of the folders, could have been left outside the name check as in your original code and would cause no harm except to waste a few CPU cycles in initializing your pool processes).

I have also tried to follow the PEP 8 – Style Guide for Python Code in the naming of variables, spacing, etc. I also found your use of the word "address" to refer to a folder/directory name to be a bit unusual (path, directory, directory_name, folder or folder_name would all have been more meaningful to someone reading the code).
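
A minimal sketch of the restructuring described above might look like the following. The paths and Test.exe come from the question. The names base_folder, folders, make_copies and main are only illustrative, and the early return when the base folder is missing is an assumption added here, not something the original code did.

```python
import os
import shutil
from multiprocessing import Pool


def call_exe(folder):
    """Worker: run Test.exe from inside the given folder."""
    os.chdir(folder)  # each pool process has its own working directory
    return os.system('Test.exe')


def make_copies(base_folder, folders):
    """Copy the base folder to each target folder (targets must not exist yet)."""
    for folder in folders:
        shutil.copytree(base_folder, folder)


def main():
    # Paths taken from the question; the variable names are illustrative.
    base_folder = 'C:\\Users\\bappi\\Desktop\\Base_address'
    folders = [
        'C:\\Users\\bappi\\Desktop\\address_a',
        'C:\\Users\\bappi\\Desktop\\address_b',
    ]
    # Assumption: stop early with a message instead of failing inside copytree.
    if not os.path.isdir(base_folder):
        print(f'Base folder not found: {base_folder}')
        return
    make_copies(base_folder, folders)  # runs once, in the main process only
    with Pool(len(folders)) as pool:
        pool.map(call_exe, folders)


if __name__ == '__main__':
    main()
```

Because the copytree calls now live in a function that is only reached from the if __name__ == '__main__': block, a spawned pool process that re-imports the script executes only the imports and function definitions, not the folder creation.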