Dask-gateway - sending self-defined python files to worker


I am using Dask in a cluster environment through Dask-gateway. I created the computation graph in a Jupyter Notebook. My delayed function calls multiple functions defined in several separate .py files. Currently, I am running into an error on the worker saying that these modules cannot be found. I guess this is because the workers have no access to those .py files. How can I 'send' these .py files to the workers?

Thanks.

1 Answer

Answered by Michael Delgado:

Yep, exactly. All classes/functions/modules must be defined within the notebook, or must be importable on the worker using the same syntax as is called by the client.

You can either define your methods within the notebook, deploy the modules with pip or some other deployment mechanism, or use client.upload_file.

From the dask docs:

upload_file(filename, **kwargs)

Upload local package to workers

This sends a local file up to all worker nodes. This file is placed into a temporary directory on Python’s system path so any .py, .egg or .zip files will be importable.

client.upload_file('mylibrary.egg')
from mylibrary import myfunc
L = client.map(myfunc, seq)
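Since you have several separate .py files, one convenient approach is to bundle them into a single zip and upload that once, since upload_file places .zip files on each worker's sys.path. The sketch below is a minimal illustration, assuming a package directory layout like mypackage/__init__.py and mypackage/helpers.py; the names "mypackage" and "upload_to_workers" are placeholders, not part of any Dask API.

```python
# Sketch: bundle several local .py files into one zip so the workers
# can import them after a single client.upload_file() call.
# "mypackage" is a hypothetical package directory containing your .py files.
import os
import zipfile


def bundle_package(package_dir: str, zip_path: str) -> str:
    """Zip a package directory, preserving the package prefix in archive
    names so `import mypackage.helpers` still works on the workers."""
    parent = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(zip_path, "w") as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Store paths relative to the package's parent directory,
                    # e.g. "mypackage/helpers.py", not just "helpers.py".
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path


def upload_to_workers(client, zip_path: str) -> None:
    """Ship the bundle to the workers; upload_file puts the zip on each
    worker's sys.path, so the modules become importable in delayed tasks."""
    client.upload_file(zip_path)
```

With dask_gateway you would typically get the client from the cluster object, roughly: client = cluster.get_client(); then upload_to_workers(client, bundle_package("mypackage", "mypackage.zip")). Note that upload_file only reaches workers that exist at call time, so re-run it (or re-bundle) after scaling up or editing the files.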