By my reading of Dask-Jobqueue (https://jobqueue.dask.org/en/latest/), and by testing on our SLURM cluster, it seems when you set
cluster.scale(n), and create
client = Client(cluster), none of your jobs are able to start until all
n of your jobs are able to start.
Suppose you have 999 jobs to run, and a cluster with 100 nodes or slots; worse yet, suppose other people share the cluster, and maybe some of them have long-running jobs. Admins sometimes need to do maintenance on some of the nodes, so they add and remove nodes. You never know how much parallelism you'll be able to get. You want the cluster scheduler to simply take 999 jobs (in slurm, these would be submitted via
sbatch), run them in any order on any available nodes, store results in a shared directory, and have a dependent job (in slurm, that would be
sbatch --dependency=) process the shared directory after all 999 jobs completed. Is this possible with DASK somehow?
It seems a fundamental limitation of the architecture, that all the jobs are expected to run in parallel, and the user must specify the degree of parallelism.