I am using Dask in a cluster environment through Dask-gateway. I created the computation graph in a Jupyter Notebook. My delayed function calls multiple functions defined in several separate .py files. Currently, the workers fail with an error saying that these modules cannot be found. I guess this is because the workers have no access to those .py files. How can I 'send' these .py files to the workers?
Thanks.
Yep, exactly. All classes/functions/modules must either be defined within the notebook or be importable on the worker using the same import syntax the client uses.
You can either define your methods within the notebook, deploy the modules with pip or some other deployment mechanism, or use client.upload_file.
From the dask docs:
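Separately from the docs excerpt, here is a minimal sketch of the upload_file route. The helper name upload_modules and the file names in the usage comment are hypothetical, and it assumes client is a dask.distributed.Client such as the one dask-gateway returns from cluster.get_client():

```python
from pathlib import Path


def upload_modules(client, paths):
    """Send each local source file to the workers via Client.upload_file.

    `client` is assumed to be a dask.distributed.Client; `paths` is a list
    of local .py (or .zip/.egg) files that the delayed functions import.
    Returns the names of the files that were uploaded.
    """
    uploaded = []
    for path in paths:
        path = Path(path)
        # Fail early on the client side instead of on the workers.
        if not path.is_file():
            raise FileNotFoundError(f"module file not found: {path}")
        client.upload_file(str(path))
        uploaded.append(path.name)
    return uploaded


# Hypothetical usage in the notebook (file names are placeholders):
# from dask_gateway import Gateway
# cluster = Gateway().new_cluster()
# client = cluster.get_client()
# upload_modules(client, ["helpers.py", "preprocessing.py"])
```

After the upload, the delayed functions should be able to import the modules by the same names as in the notebook. Depending on the distributed version, workers that join later may need the upload repeated, so for anything long-lived, installing the code as a package on the worker image is probably the more robust choice.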