Difference between revisions of "Submitting Array Jobs"
Revision as of 15:36, 5 May 2023
Submitting array jobs
A job array can be submitted simply by adding
#SBATCH --array=x-y
to the job script, where x and y are the array bounds (the first and last task IDs). A job array can also be specified on the command line with
sbatch --array=x-y job_script.sbatch
A job array will then be created, consisting of independent jobs, also known as array tasks, one for each index in the defined range.
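As an illustration of how the independent array tasks can do different work, the following is a minimal sketch of a job script. It relies on the SLURM_ARRAY_TASK_ID environment variable, which SLURM sets to the index of each array task. The input_for_task helper and the input_N.txt naming scheme are hypothetical and only serve the example.

```shell
#!/bin/bash
#SBATCH --array=1-10

# Map a task index to an input file name.
# The naming scheme is hypothetical; adapt it to your own data layout.
input_for_task() {
    printf 'input_%s.txt\n' "$1"
}

# SLURM sets SLURM_ARRAY_TASK_ID for every array task. Default to 1 so the
# script can also be tried outside of SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input_file="$(input_for_task "$task_id")"

echo "Task ${task_id} would process ${input_file}"
```

Each of the ten tasks runs the same script but sees a different task ID, so each one selects a different input file.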
SLURM's job array handling is very versatile. Instead of a task range, a comma-separated list of task numbers can be provided, as in
sbatch --array=4,8,15,16,23,42 job_script.sbatch
This can be used, for example, to quickly rerun a few failed tasks from a previously completed job array. Command-line options override options in the script, so the script can be left unchanged.
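If the failed task IDs are already known, the comma-separated list can be assembled by a small helper rather than typed by hand. This is only a sketch: join_task_ids is a hypothetical helper, the IDs are the ones from the example above, and the sbatch command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh

# Join all arguments with commas, the list form accepted by --array.
# The function body runs in a subshell so the IFS change does not leak.
join_task_ids() (
    IFS=,
    printf '%s\n' "$*"
)

# Task IDs to rerun (taken from the example in the text).
spec="$(join_task_ids 4 8 15 16 23 42)"

# Print the command for review instead of running it.
echo "sbatch --array=${spec} job_script.sbatch"
```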
Limiting the number of tasks that run at once
To throttle a job array so that only a certain number of tasks are active at a time, use the %N suffix, where N is the maximum number of simultaneously active tasks. For example
#SBATCH -a 1-200%5
will produce a 200 task job array with only 5 tasks active at any given time.
Note that although the symbol used is the % sign, N is not a percentage: it is the actual number of tasks allowed to run at once.
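When the array bounds and the throttle are computed in a wrapper script, the specification string can be built as sketched below. throttled_spec is a hypothetical helper, not part of SLURM. It only formats the string, and the resulting sbatch command is printed rather than run.

```shell
#!/bin/sh

# Build an array specification of the form FIRST-LAST%LIMIT.
# LIMIT is a plain task count, not a percentage.
throttled_spec() {
    first=$1
    last=$2
    limit=$3
    case "$limit" in
        ''|*[!0-9]*)
            echo "limit must be a whole number" >&2
            return 1
            ;;
    esac
    # %% in the printf format produces a literal % sign.
    printf '%s-%s%%%s\n' "$first" "$last" "$limit"
}

spec="$(throttled_spec 1 200 5)"
echo "sbatch --array=${spec} job_script.sbatch"
```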
Using scontrol to modify throttling of running array jobs
If you want to change the number of simultaneous tasks of an active job, you can use scontrol:
scontrol update ArrayTaskThrottle=<count> JobId=<jobID>
For example:
scontrol update ArrayTaskThrottle=50 JobId=12345
Set ArrayTaskThrottle=0 to eliminate any limit.
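A small wrapper can make the intent explicit when the throttle is adjusted repeatedly. The sketch below only assembles and prints the scontrol command shown above. throttle_command is a hypothetical helper, and 12345 is the example job ID from the text.

```shell
#!/bin/sh

# Assemble the scontrol command that changes the throttle of an array job.
# A count of 0 removes the limit.
throttle_command() {
    job_id=$1
    count=$2
    printf 'scontrol update ArrayTaskThrottle=%s JobId=%s\n' "$count" "$job_id"
}

# Allow 50 simultaneous tasks, then remove the limit entirely.
throttle_command 12345 50
throttle_command 12345 0
```

To apply a change, run the printed command on a host where scontrol is available.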
Naming output and error files
SLURM uses the %A and %a replacement strings for the master job ID and task ID, respectively.
For example:
#SBATCH --output=Array_test.%A_%a.out
#SBATCH --error=Array_test.%A_%a.error
A separate error log is optional: if --error is not specified, standard error is written to the same file as standard output.
#SBATCH --output=Array_test.%A_%a.log
Note: If you only use %A in the log file name, all array tasks will try to write to a single file, and performance will degrade severely as the tasks contend for that file. Make sure to use both %A and %a in the log file name specification.
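The effect of the two replacement strings can be illustrated with a small emulation. The expand_log_name helper below is hypothetical and handles only %A and %a. It is not how SLURM itself performs the substitution, and the job ID 12345 is an example value.

```shell
#!/bin/sh

# Emulate the %A (master job ID) and %a (task ID) substitution
# for a log file name pattern.
expand_log_name() {
    printf '%s\n' "$1" | sed -e "s/%A/$2/g" -e "s/%a/$3/g"
}

# With both %A and %a, every task gets its own file.
for task in 1 2; do
    expand_log_name 'Array_test.%A_%a.log' 12345 "$task"
done

# With only %A, both tasks resolve to the same file name.
for task in 1 2; do
    expand_log_name 'Array_test.%A.log' 12345 "$task"
done
```

The first loop yields two distinct file names, while the second yields the same name twice, which is the collision the note warns about.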