I'm trying to schedule some batch processes in Python with the schedule library, and the jobs use multiprocessing. They keep hanging, so I wrote a smaller test job to isolate the problem. I wrote it on my Mac, where it works perfectly, but when I run it on my work PC it hangs. When I run the constituent parts on their own, they all work on both systems.

Both machines run Python 3.7. The Mac is on macOS Sierra; the PC is on Windows 10 Pro.

''' This is the test process to schedule (saved as mp_lib) '''

import pandas as pd
import numpy as np
from multiprocessing import Pool

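def parallelise(df, func):
    # Split the frame into two chunks, one per worker process.
    df_split = np.array_split(df, 2)
    pool = Pool(2)
    # Apply func to each chunk in a worker and stitch the results back together.
    df = pd.concat(pool.map(func, df_split))
    # Stop accepting work and wait for the workers to exit.
    pool.close()
    pool.join()
    return df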

def parallel_func(df):
    df['Three'] = df['One'] + df['Two']
    return df

def run_job():
    df = pd.DataFrame([[1,2],[3,4],[5,6],[7,8]], columns=['One', 'Two'])
    df2 = parallelise(df, parallel_func)
    print(df2)


''' This is the scheduler '''

import schedule
import time
import mp_lib as mpl

def main():
    schedule.every(10).seconds.do(mpl.run_job)
    while True:
        schedule.run_pending()
        time.sleep(1)

if __name__ == "__main__":
    main()

I expect the console to print the resulting DataFrame every 10 seconds, which it does on the Mac. On Windows the console stays silent, although Task Manager shows active Python processes.
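
One way to narrow this down might be to swap the process pool for the thread-backed pool in multiprocessing.dummy, which exposes the same Pool API but starts no child processes. The sketch below is only an illustration under assumptions: add_columns and parallelise_threads are hypothetical names, and plain lists of (one, two) tuples stand in for the DataFrame chunks so the example needs nothing outside the standard library.

```python
# Thread-backed pool with the same interface as multiprocessing.Pool.
from multiprocessing.dummy import Pool as ThreadPool


def add_columns(rows):
    # Hypothetical stand-in for parallel_func: append one + two to each row.
    return [(one, two, one + two) for one, two in rows]


def parallelise_threads(rows, func, workers=2):
    # Split rows into `workers` roughly equal chunks
    # (a plain-list analogue of np.array_split).
    size, extra = divmod(len(rows), workers)
    chunks, start = [], 0
    for i in range(workers):
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    # Threads share the parent's memory, so no child process is launched
    # and nothing has to be pickled or re-imported.
    with ThreadPool(workers) as pool:
        results = pool.map(func, chunks)
    # Flatten the per-chunk results back into one list.
    return [row for chunk in results for row in chunk]


if __name__ == "__main__":
    print(parallelise_threads([(1, 2), (3, 4), (5, 6), (7, 8)], add_columns))
```

If a scheduled job built this way prints on Windows while the multiprocessing.Pool version stays silent, that would suggest the hang is tied to how worker processes start on Windows rather than to schedule itself. That is a diagnostic hint, not a confirmed cause.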
