Annotated SLURM Script

From UFRC

Latest revision as of 20:49, 13 May 2024

This is a walk-through of a basic SLURM scheduler job script for the common case of a multi-threaded analysis. If the program you run is single-threaded (can use only one CPU core), use only the '--ntasks=1' line for the CPU request instead of all three CPU request lines ('--nodes', '--ntasks', and '--cpus-per-task'). Annotations are marked with bullet points. The raw job script file without the annotations is available for download as run.sh. Values in angle brackets are placeholders that you need to replace with your own values, e.g. change '<JOBNAME>' to something like 'blast_proj22'. Additional documentation will cover more complex job layouts for MPI jobs and other situations where a simple number of processor cores is not sufficient.

Set the shell to use
#!/bin/bash
Common arguments
  • Name the job to make it easier to see in the job queue
#SBATCH --job-name=<JOBNAME>
Email
Your email address to use for all batch system communications. Use one of the two forms below: a single address, or a comma-separated list of addresses.
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-user=<EMAIL-ONE>,<EMAIL-TWO>
What emails to send
NONE - no emails
ALL - all emails
END,FAIL - email only when the job fails or ends; the end-of-job email includes a summary
#SBATCH --mail-type=FAIL,END
Standard Output and Error log files
Use file patterns
%j - job id
%A-%a - Array job id (A) and task id (a)
You can also use --error for a separate stderr log
#SBATCH --output <my_job-%j.out>
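For array jobs, the %A and %a patterns listed above keep the log of each array task separate, and --error sends stderr to its own file. A sketch of how those directives might look, assuming the hypothetical file name prefix my_job:

```shell
# Assumed file name prefix "my_job"; replace it with your own.
# %A expands to the array job id and %a to the array task id.
#SBATCH --output=my_job-%A-%a.out
#SBATCH --error=my_job-%A-%a.err
```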
Number of nodes to use. For all non-MPI jobs this number will be equal to '1'
#SBATCH --nodes=1
Number of tasks. For all non-MPI jobs this number will be equal to '1'
#SBATCH --ntasks=1
Number of CPU cores to use. This number must match the argument used for the program you run.
#SBATCH --cpus-per-task=4
Total memory limit for the job. The default is 2 gigabytes; units can be specified with mb or gb for megabytes or gigabytes.
#SBATCH --mem=4gb
Job run time in [DAYS-]HOURS:MINUTES:SECONDS
The [DAYS-] part is optional; use it when it is convenient. For example, 3-00:00:00 requests the same run time as 72:00:00.
#SBATCH --time=72:00:00
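To check that two ways of writing a time limit request the same run time, the arithmetic can be sketched in shell. This is only an illustration: the helper name to_seconds is hypothetical, and it assumes the limit is written either as HOURS:MINUTES:SECONDS or with a leading day count joined by a hyphen, which is how Slurm writes days (for example 3-00:00:00).

```shell
# Hypothetical helper: convert a time limit written as
# [DAYS-]HOURS:MINUTES:SECONDS into a number of seconds.
to_seconds() {
    case $1 in
        *-*) days=${1%%-*}; clock=${1#*-} ;;   # split off the day count
        *)   days=0;        clock=$1 ;;        # no day count given
    esac
    # Let awk do the arithmetic so leading zeros are read as decimal.
    printf '%s:%s\n' "$days" "$clock" |
        awk -F: '{ print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}

to_seconds 72:00:00
to_seconds 3-00:00:00
```

Both calls print the same number of seconds, so the two forms are interchangeable.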
Optional
A group to use if you belong to multiple groups. Otherwise, do not use.
#SBATCH --account=<GROUP>
A job array, which will create many jobs (called array tasks) that differ only in the '$SLURM_ARRAY_TASK_ID' variable
#SBATCH --array=<BEGIN-END>
Example of five tasks
#SBATCH --array=1-5
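In the shell part of the script, each array task can use $SLURM_ARRAY_TASK_ID to pick its own input. A minimal sketch, assuming hypothetical input files named input_1.fa through input_5.fa; the helper name input_for_task and the file naming scheme are illustrative only. The fallback to 1 is there so the snippet also runs outside a job array.

```shell
# Hypothetical helper: map an array task id to an input file name.
# The input_<N>.fa naming scheme is an assumption for illustration.
input_for_task() {
    printf 'input_%s.fa\n' "$1"
}

# Slurm sets SLURM_ARRAY_TASK_ID in each array task; default to 1 elsewhere.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
query="$(input_for_task "$task_id")"
echo "Array task ${task_id} will process ${query}"
```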

Recommended convenient shell code to put into your job script
  • Add host, time, and directory name for later troubleshooting
date;hostname;pwd

Below is the shell script part - the commands you will run to analyze your data. The following is an example.

  • Load the software you need
module load ncbi_blast
  • Run the program
blastn -db nt -query input.fa -outfmt 6 -out results.xml -num_threads 4

date
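One way to keep the program's thread count in step with the '--cpus-per-task' request is to read it from the environment instead of typing the number twice. A sketch, relying on Slurm setting SLURM_CPUS_PER_TASK inside the job when '--cpus-per-task' is given; the helper name thread_count is hypothetical, and the fallback to 1 is an assumption that makes the snippet usable outside a job.

```shell
# Hypothetical helper: report the number of CPU cores granted per task,
# falling back to 1 when the variable is not set (e.g. outside a job).
thread_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

threads="$(thread_count)"
echo "Running with ${threads} thread(s)"

# The blastn call from the example could then be written as:
# blastn -db nt -query input.fa -outfmt 6 -out results.xml -num_threads "$threads"
```

With this in place, changing '--cpus-per-task' changes the thread count passed to the program as well.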