SLURM Job Arrays
Introduction
To submit a number of identical jobs without having to drive the submission with an external script, use SLURM's array jobs feature.
Submitting array jobs
A job array can be submitted simply by adding
#SBATCH --array=x-y
to the job script where x and y are the array bounds. A job array can also be specified at the command line with
sbatch --array=x-y job_script.sbatch
A job array will then be created with a number of tasks corresponding to the specified array size.
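For example, a minimal job script might look like the following sketch (the job name and the echo command are illustrative placeholders):
#!/bin/bash
#SBATCH --job-name=array_example
#SBATCH --array=1-10
# The same script is run once per task; SLURM sets $SLURM_ARRAY_TASK_ID
# to the task's index (see "Using the array ID index" below).
echo "Running task $SLURM_ARRAY_TASK_ID"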
SLURM's job array handling is very versatile. Instead of a task range, a comma-separated list of task numbers can be provided, which is useful, for example, to quickly rerun a few failed tasks from a previously completed job array:
sbatch --array=4,8,15,16,23,42 job_script.sbatch
Command-line options override the options in the script, so the script itself can be left unchanged.
Limiting the number of tasks that run at once
To throttle a job array by keeping only a certain number of tasks active at a time, use the %N
suffix, where N is the number of active tasks. For example,
#SBATCH --array=1-200%5
will produce a 200-task job array with only 5 tasks active at any given time.
Note that although the % symbol is used, N is the actual number of tasks allowed to run at once, not a percentage.
Naming output and error files
SLURM uses the %A and %a replacement strings for the master job ID and task ID, respectively.
For example:
#SBATCH --output=Array_test.%A_%a.out
#SBATCH --error=Array_test.%A_%a.error
Using the array ID index
SLURM will provide a $SLURM_ARRAY_TASK_ID variable to each task. It can be used inside the job script to handle input and output files for that task.
For instance, for a 100-task job array the input files can be named seq_1.fa, seq_2.fa and so on through seq_100.fa. In a job script for a blastn job they can be referenced as blastn -query seq_${SLURM_ARRAY_TASK_ID}.fa. The output files can be handled in the same way.
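A minimal job script along these lines might look like the following sketch (the BLAST database name, output file names, and array bounds are illustrative placeholders):
#!/bin/bash
#SBATCH --array=1-100
#SBATCH --output=blast.%A_%a.out
# Each task selects its own input and output files via its task ID
blastn -query seq_${SLURM_ARRAY_TASK_ID}.fa -db nt -out result_${SLURM_ARRAY_TASK_ID}.out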
One common application of array jobs is to process many input files. While this is easiest when the files are numbered as in the example above, numbering is not required. If, for example, you have a folder of 100 files that end in .txt, you can use a combination of ls, head and tail to get the file name for each task:
file=`ls *.txt | head -n $SLURM_ARRAY_TASK_ID | tail -n 1`
myscript -in $file
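Putting this together, a complete job script using that pattern might look like the following sketch (myscript and the array bounds are placeholders for your own program and file count):
#!/bin/bash
#SBATCH --array=1-100
# Select the Nth .txt file in the directory, where N is this task's ID
file=`ls *.txt | head -n $SLURM_ARRAY_TASK_ID | tail -n 1`
myscript -in $file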
Deleting job arrays and tasks
To delete all of the tasks of an array job, use scancel
with the job ID:
scancel 292441
To delete a single task, add the task ID:
scancel 292441_5