I am using a job array to process a large number of files. I can pass a pointer from my array to the specific data file to be processed in the job script, but I also want to pass the specific SLURM job ID to the script, and I can't find the correct syntax to do so.
My array script currently looks like this:
#!/bin/bash
# ============================================
#SBATCH --job-name=sortdata
...
#SBATCH --output=down1count/sort_%A_%a.txt
#SBATCH --array=0-99
# ============================================
SIZE=30
INDEX_FILE="down1list.txt"
IDXZERO=$(( SLURM_ARRAY_TASK_ID * SIZE ))
IDXBEG=$(( IDXZERO + 1 ))
IDXEND=$(( IDXBEG + SIZE - 1 ))
for IDX in $(seq $IDXBEG $IDXEND); do
    DATA=$(sed -n ${IDX}p $INDEX_FILE)
    sortfile1.bash $DATA
done
where down1list.txt is just a list of the files in the directory, created by ls down1/ >> down1list.txt.
The relevant part of my job script sortfile1.bash looks like this:
#!/bin/bash
for file in "down1/$@"; do
    gunzip $file
    ### do some more stuff with the file ###
done
What I would like to do is use my cluster's larger file system storage, but it can only be accessed through my ${SLURM_JOB_ID}. I would then mv the file there before I unzip it in the above code. I've looked at a bunch of different questions and answers on this site and I can't seem to find anything that covers the syntax I am missing.
I believe that by using $@ I ought to be able to access the ${SLURM_JOB_ID}, but I can't figure out how to add it correctly to the sortfile1.bash $DATA line or how I would call it in my sortfile1.bash code. I tried just adding it directly like this: sortfile1.bash $DATA %A_%a but that doesn't seem to work.
The ${SLURM_JOB_ID} environment variable should be visible to all programs that are part of the job, so you should be able to simply use it directly in the code of sortfile1.bash.
Should that not be the case, the usual approach is to pass the variable as the first argument and use the shift builtin to skip it once its value has been stored in another variable. In the submission script, that means placing ${SLURM_JOB_ID} before $DATA on the sortfile1.bash line.
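The approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: process_files is a hypothetical function standing in for the body of sortfile1.bash so that the logic can be tried outside SLURM, the fallback ID 12345 is made up, and the echo is a placeholder for the real mv/gunzip steps, whose target path is not given in the question.

```bash
#!/bin/bash
# Sketch of sortfile1.bash: the job ID arrives as the first argument.
# process_files is a hypothetical wrapper, not part of the original script.
process_files() {
    local jobid=$1   # first argument: the SLURM job ID
    shift            # drop it; "$@" now holds only the file names
    local file
    for file in "$@"; do
        # Placeholder for the real work (mv to the job-specific storage, gunzip, ...)
        echo "job ${jobid}: would process down1/${file}"
    done
}

# In the submission script the call would put the job ID before the data file:
#   sortfile1.bash "${SLURM_JOB_ID}" "$DATA"
# Outside SLURM the variable is unset, so a made-up ID stands in here.
process_files "${SLURM_JOB_ID:-12345}" "a.gz" "b.gz"
```

Note that %A and %a are only expanded by sbatch in file name patterns such as --output. Inside the script, the corresponding values are available as ${SLURM_ARRAY_JOB_ID} and ${SLURM_ARRAY_TASK_ID}, which is why appending %A_%a to the command line passes the literal text.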
After shift is called, $@ will hold the list of arguments except for the first one, each being "shifted": $2 -> $1, $3 -> $2, etc.
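To see the renumbering in isolation, a small demonstration along these lines may help. show_shift is a hypothetical helper and the argument values are arbitrary; it only prints the positional parameters before and after shift.

```bash
#!/bin/bash
# show_shift is a hypothetical helper that reports the positional
# parameters before and after a single shift.
show_shift() {
    echo "before: \$1=$1 count=$# all=$*"
    shift   # discard the first argument and renumber the rest
    echo "after: \$1=$1 count=$# all=$*"
}

show_shift 12345 a.gz b.gz
# before: $1=12345 count=3 all=12345 a.gz b.gz
# after: $1=a.gz count=2 all=a.gz b.gz
```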