I need to speed up a somewhat slow Python calculation. My regular code uses only one CPU core, and I need the process to use all of them.

So I found that ProcessPoolExecutor() from the concurrent.futures module can do that for me.

Here is an example that demonstrates it (the original is here: https://towardsdatascience.com/heres-how-you-can-get-a-2-6x-speed-up-on-your-data-pre-processing-with-python-847887e63be5 ):

The example resizes all images in a folder to 600x600. The baseline, single-process code is:

import glob
import os
import cv2

for image_filename in glob.glob("*.jpg"):
    img = cv2.imread(image_filename)
    img = cv2.resize(img, (600, 600))

Using ProcessPoolExecutor(), which reportedly makes it up to 6 times faster on a machine with 6 CPU cores, the code looks like this:

import glob
import os
import cv2
import concurrent.futures

def load_and_resize(image_filename):
    img = cv2.imread(image_filename)
    img = cv2.resize(img, (600, 600))

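if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start them by importing the main module (e.g. Windows).
    with concurrent.futures.ProcessPoolExecutor() as executor:
        image_files = glob.glob("*.jpg")
        executor.map(load_and_resize, image_files)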

OK, that seems pretty straightforward to me.

Now, how do I apply this to my case?

My setup is like this:

# basic function for performing the calculations
def slow_time_consuming_function(arg1, arg2, arg3, arg4):
    # do some slow calculations with arg1, arg2, arg3, arg4
    # return some float result
    ...

# list of arg1 values (usual length of 100-500) 
arg1_list = ['xx', 'xx1', 'xx2', ...]

# function for calculating the whole range of arg1 values
def multiple_arg1_values_calculation_function(arg1_list, arg2, arg3, arg4):
    # empty list for the results
    list_results = []

    # loop for calculating the results
    for arg1 in arg1_list:
        list_results.append(slow_time_consuming_function(arg1, arg2, arg3, arg4))

    return list_results

My problem is that I can't hard-code the non-arg1 arguments into the basic function (some of them are custom-created Python objects, etc.).
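One thing that may matter for the custom objects: ProcessPoolExecutor sends arguments to its worker processes by pickling them, so arg2, arg3 and arg4 would need to be picklable. A minimal sketch of a check, where `is_picklable` is a hypothetical helper name and the sample values are only illustrations:

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, which worker processes rely on."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data pickles fine; objects holding OS-level resources, such as a lock, do not.
print(is_picklable({"rate": 0.5}))      # True
print(is_picklable(threading.Lock()))   # False
```

Running a check like this on each custom object before starting the pool should surface pickling problems early, instead of as an error raised from inside the executor.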

So how can I transform my code to use concurrent.futures.ProcessPoolExecutor()?
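For reference, one possible direction is to freeze the fixed arguments with functools.partial and map only over arg1_list. This is a minimal sketch, not a definitive solution. It assumes arg2, arg3 and arg4 are picklable. The body of slow_time_consuming_function is a made-up stand-in, and the `executor_class` parameter is a hypothetical addition for swapping executors.

```python
import concurrent.futures
from functools import partial


def slow_time_consuming_function(arg1, arg2, arg3, arg4):
    # Stand-in body (hypothetical): the real slow calculation goes here.
    return float(len(arg1) * arg2 + arg3 - arg4)


def multiple_arg1_values_calculation_function(
    arg1_list, arg2, arg3, arg4,
    executor_class=concurrent.futures.ProcessPoolExecutor,
):
    # Freeze the arguments that are the same for every call; only arg1 varies.
    worker = partial(slow_time_consuming_function, arg2=arg2, arg3=arg3, arg4=arg4)
    with executor_class() as executor:
        # map() yields results in the same order as arg1_list.
        return list(executor.map(worker, arg1_list))


if __name__ == "__main__":
    # The guard matters when worker processes are started by importing this module.
    print(multiple_arg1_values_calculation_function(["xx", "xx1", "xx2"], 2, 3, 4))
```

The function has to be defined at module level so the worker processes can import it. Passing `executor_class=concurrent.futures.ThreadPoolExecutor` runs the same code in threads, which may help when tracking down pickling errors.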
