Submitting Array Jobs
Back to SLURM Job Arrays
A job array can be submitted simply by adding
#SBATCH --array=x-y
to the job script where x and y are the array bounds. A job array can also be specified at the command line with
sbatch --array=x-y job_script.sbatch
A job array will then be created with a number of independent jobs, also known as array tasks, that correspond to the defined range.
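For illustration, a minimal array job script might look like the following sketch; the program name, input files, and resource requests are placeholders:
#!/bin/bash
#SBATCH --job-name=array_example
#SBATCH --array=1-10
#SBATCH --time=00:10:00
#SBATCH --mem=1G
# Each task runs its own copy of this script; SLURM_ARRAY_TASK_ID
# holds the task's index and can be used to select an input file.
./my_program input_${SLURM_ARRAY_TASK_ID}.dat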
SLURM's job array handling is very versatile. Instead of a range, a comma-separated list of task numbers can be provided, for example to rerun a few failed tasks from a previously completed job array, as in
sbatch --array=4,8,15,16,23,42 job_script.sbatch
which quickly reruns the lost tasks from a previous job array. Command line options override options in the script, so the script itself can be left unchanged.
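As a sketch of how such a list could be assembled automatically, the following uses sacct to collect the indices of FAILED tasks from a finished array job (12345 is a placeholder job ID) and resubmits only those tasks; the exact sacct output may vary with site configuration:
# gather the indices of failed tasks, e.g. "4,8,15,16,23,42"
failed=$(sacct -j 12345 -X --noheader -o JobID,State | awk '$2 == "FAILED" {split($1, a, "_"); print a[2]}' | paste -sd,)
sbatch --array=$failed job_script.sbatch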
Limiting the number of tasks that run at once
To throttle a job array, keeping only a certain number of tasks active at a time, use the %N suffix, where N is the number of active tasks. For example
#SBATCH -a 1-200%5
will produce a 200 task job array with only 5 tasks active at any given time.
Note that although the % symbol is used, N is the actual number of tasks allowed to run at once, not a percentage.
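The same throttle can also be given on the command line:
sbatch --array=1-200%5 job_script.sbatch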
Using scontrol to modify throttling of running array jobs
- Note
- Reducing the ArrayTaskThrottle count on a running job array will not affect tasks that have already entered the RUNNING state; it only prevents new tasks from starting until the number of running tasks drops below the new, lower threshold.
If you want to change the number of simultaneously running tasks of an active job array, you can use scontrol:
scontrol update ArrayTaskThrottle=<count> JobId=<jobID>
e.g.
scontrol update ArrayTaskThrottle=50 JobId=12345
Set ArrayTaskThrottle=0 to eliminate any limit.
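To check the current throttle of a pending or running array job, scontrol show job includes an ArrayTaskThrottle field in its output (12345 is again a placeholder job ID):
scontrol show job 12345 | grep -o 'ArrayTaskThrottle=[0-9]*'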
Naming output and error files
SLURM uses the %A and %a replacement strings for the master job ID and task ID, respectively.
For example:
#SBATCH --output=Array_test.%A_%a.out
#SBATCH --error=Array_test.%A_%a.error
The error log is optional, as both standard output and standard error can be written to a single 'output' log:
#SBATCH --output=Array_test.%A_%a.log
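With this naming scheme, an array submitted with tasks 1-3 and assigned the (hypothetical) job ID 12345 would produce:
Array_test.12345_1.log
Array_test.12345_2.log
Array_test.12345_3.log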
- Note
- If only %A is used in the log file name, all array tasks will try to write to a single file, and the performance of the run will approach zero asymptotically. Make sure to use both %A and %a in the log file name specification.