Slurm error "Allocation requested cores/tasks must be in quarter increments"


Hi, I am using the Bridges-2 supercomputer at PSC to run jobs. When I try to submit a job with the script below, I get the following error:

sbatch: error: Allocation requested cores/tasks must be in quarter increments of EM node resources (24, 48, 72, 96)

sbatch: error: Batch job submission failed: Access/permission denied

#!/bin/bash
#SBATCH -N 1
#SBATCH -p EM
#SBATCH --job-name=Job_3
#SBATCH -t 5-00:00:00
#SBATCH --mail-user [email protected]
#SBATCH --mail-type FAIL
#SBATCH --error=/jet/home/xyz/sdn_result/run1/err/Job_3.err 
#SBATCH --output=/jet/home/xyz/sdn_result/run1/out/Job_3.out
#SBATCH --ntasks=1
cd /jet/home/xyz/sdn_result/run1/application1/512; /jet/home/xyz/install/codes-swm/bin/model-net-mpi-replay --sync=1 --workload_type=online --workload_conf_file=/jet/home/xyz/sdn_result/workconf/application1-512.all --alloc_file=/jet/home/xyz/sdn_result/allocfiles/512-single.alloc --lp-io-dir=/jet/home/xyz/sdn_result/run1/application1/512/lpio-sdn -- /jet/home/xyz/sdn_result/netconf/512-sdn.conf &> /jet/home/xyz/sdn_result/run1/application1/512/output-sdn; cd /jet/home/xyz/sdn_result/run1

Please let me know how I can resolve this error.


There are 2 answers

0
AndyT On

The use of the EM partition on Bridges-2 is well documented in the PSC documentation at:

https://www.psc.edu/resources/bridges-2/user-guide/

(under Partitions in the table of contents, select "For EM allocations"). This includes the constraints on jobs:

Jobs in the EM partition

  • run on Bridges-2’s EM nodes, which have 4TB of memory and 96 cores per node
  • can use at most one full EM node
  • must specify the number of cores to use
  • must use a multiple of 24 cores. A job can request 24, 48, 72 or 96 cores.

They also have example job submission scripts you can follow. In your case, you need to change the --ntasks=1 option (the long form of -n 1) so that it requests 24, 48, 72 or 96.
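As a rough sketch of that change, the header of the script from the question could look like the block below, with only the task count altered. The is_valid_em_cores function is a hypothetical helper written for this illustration (it is not part of Slurm or of PSC's tooling); it just encodes the "multiple of 24, at most 96" rule quoted above so the request can be sanity-checked. The fallback value of 24 is an assumption so the script also runs outside a job.

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH -p EM
#SBATCH --job-name=Job_3
#SBATCH -t 5-00:00:00
# Was --ntasks=1; the EM partition accepts 24, 48, 72 or 96.
#SBATCH --ntasks=24

# Hypothetical helper (illustration only): succeeds when the argument is a
# multiple of 24 between 24 and 96, as the EM rule requires.
is_valid_em_cores() {
    local n="$1"
    # Reject empty or non-numeric input.
    case "$n" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$n" -ge 24 ] && [ "$n" -le 96 ] && [ $((10#$n % 24)) -eq 0 ]
}

# Slurm sets SLURM_NTASKS inside the job; fall back to 24 elsewhere (assumption).
ntasks="${SLURM_NTASKS:-24}"
if is_valid_em_cores "$ntasks"; then
    echo "core request OK: $ntasks"
else
    echo "core request not allowed on EM: $ntasks" >&2
fi

# The original cd / model-net-mpi-replay command line would follow here.
```

Note that requesting 24 tasks only satisfies the partition policy; whether the program itself uses more than one core depends on how it is launched.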

I would suggest carefully reading the documentation for any HPC system you use. Systems at large centres such as PSC are usually very well documented and provide all the information you need to use them.

2
damienfrancois On

This is not a standard Slurm error. It looks like a cli_filter implemented at your site to enforce some policies; judging from the message, one of them is that the number of cores per task must be one of 24, 48, 72, or 96.

So you can try adding

#SBATCH --cpus-per-task=24

to your submission script.
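For illustration, here is a minimal sketch of how the header would look with that line added, keeping the single task from the question. The total_cores function is a hypothetical helper added only to show the arithmetic (tasks times cores per task), and the default values are assumptions that mirror the directives so the script also runs outside a job. Whether the site filter checks this product or the task count alone is not something I can confirm from the message.

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH -p EM
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=24

# Hypothetical helper (illustration only): cores requested = tasks x cores per task.
total_cores() {
    echo $(( $1 * $2 ))
}

# Slurm exports these inside the job when the matching options are given;
# the defaults below simply repeat the directives above (assumption).
ntasks="${SLURM_NTASKS:-1}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-24}"

echo "cores requested: $(total_cores "$ntasks" "$cpus_per_task")"
```

With one task and 24 cores per task the request comes to 24 cores, the smallest increment listed in the error message.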