Latest revision as of 20:27, 3 June 2022

See the Annotated SLURM Job Script page for an explanation of the #SBATCH directives.

Note: Replace all <VARIABLE> sections with your information.

Simple BLASTP Job

Run a blastp job with 4 threads against the nr database.

Download raw source of the blastp.sh file.

#!/bin/bash
#SBATCH --job-name=<JOBNAME>
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-type=FAIL,END
#SBATCH --output=blastp_%j.log
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH --time=4:00:00
date;hostname;pwd

module load ncbi_blast

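# Use the CPUs requested with --cpus-per-task (4 above) as BLAST threads.
blastp -query query.fa -db nr -out output.txt -outfmt 6 -evalue 0.001 -num_threads ${SLURM_CPUS_PER_TASK}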

date
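
The thread count given to BLAST should match what SLURM allocated to the job. The following is a minimal sketch, assuming SLURM exports SLURM_CPUS_PER_TASK when --cpus-per-task is set. The helper name threads_for_blast is hypothetical, and the fallback of 1 thread is a choice made for this example, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: report how many threads BLAST may use.
# Falls back to 1 when the variable is unset, e.g. when testing outside SLURM.
threads_for_blast() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

THREADS=$(threads_for_blast)
# Print the command instead of running it, so the sketch works without BLAST installed.
echo "blastp -query query.fa -db nr -out output.txt -outfmt 6 -evalue 0.001 -num_threads ${THREADS}"
```

Deriving the value this way means that changing --cpus-per-task in the header is enough; the blastp line does not need to be edited separately.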

BLASTN Job Array

Generate input query files from a single fasta file

Create and change into input directory

mkdir input
cd input

Split the query file

faSplit sequence ../large.fasta 120 blast_query_

Note that the number of chunks requested here (120) is larger than the 100-task array used below. That is because faSplit, from the UCSC Genome Browser utilities, does not split the input query file into exactly the number of chunks specified, so check how many files were actually written before choosing the array range. Some experimentation may be required to find the number of small query files that gives the highest throughput for the BLAST alignment project; it depends on the number and size of entries in the original FASTA query file and on the SLURM allocation of the account used.
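
Because the number of chunk files is not known in advance, the array range can be derived from the directory contents. This is a minimal sketch under stated assumptions: the helper name array_range is hypothetical, the chunk file names in the demonstration are stand-ins rather than confirmed faSplit output names, and it relies on sbatch command-line options taking precedence over #SBATCH directives in the script.

```shell
#!/bin/bash
# Hypothetical helper: print an --array range covering every chunk file in a directory.
array_range() {
    local dir="$1"
    local n
    # Count regular files only; the prefix matches the one passed to faSplit.
    n=$(find "$dir" -maxdepth 1 -type f -name 'blast_query_*' | wc -l)
    n=$((n))   # normalize any whitespace padding added by wc
    if [ "$n" -eq 0 ]; then
        echo "no chunk files found in ${dir}" >&2
        return 1
    fi
    echo "1-${n}"
}

# Demonstration on a throwaway directory with three stand-in chunk files.
demo_dir=$(mktemp -d)
touch "${demo_dir}/blast_query_000.fa" "${demo_dir}/blast_query_001.fa" "${demo_dir}/blast_query_002.fa"
# Print the submission command instead of running it.
echo "sbatch --array=$(array_range "${demo_dir}") blastp_array.sh"
rm -rf "${demo_dir}"
```

Running the helper against the real input directory gives a range that matches the files present, which avoids array tasks that start with no query to process.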

Download raw source of the blastp_array.sh file.

#!/bin/bash
#SBATCH --job-name=<JOBNAME>
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-type=FAIL,END
#SBATCH --output=blastn_%A_%a.log
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH --time=4:00:00
#SBATCH --array=1-100
date;hostname;pwd

module load ncbi_blast
 
export INPUT_DIR="input"
export OUTPUT_DIR="output"
export LOG_DIR="logs"
mkdir -p ${OUTPUT_DIR} ${LOG_DIR}
 
# Array task IDs start at 1 and sed line numbers are 1-based, so task N takes the Nth file.
RUN_ID=${SLURM_ARRAY_TASK_ID}
 
QUERY_FILE=$( ls ${INPUT_DIR} | sed -n ${RUN_ID}p )
QUERY_NAME="${QUERY_FILE%.*}"
 
QUERY="${INPUT_DIR}/${QUERY_FILE}"
OUTPUT="${OUTPUT_DIR}/${QUERY_NAME}.out"
 
echo -e "Command:\nblastn -query ${QUERY} -db nt -out ${OUTPUT} -evalue 0.001 -outfmt 6 -num_threads ${SLURM_CPUS_PER_TASK}"
 
blastn -query ${QUERY} -db nt -out ${OUTPUT} -evalue 0.001 -outfmt 6 -num_threads ${SLURM_CPUS_PER_TASK}
 
date
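
The array script picks its query by line number in the listing of the input directory. The sketch below isolates that lookup so it can be checked outside SLURM: it shows which file a given index selects and what happens when the index is larger than the number of files. The helper name pick_query and the file names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: return the Nth entry (1-based) of a directory listing.
pick_query() {
    local dir="$1" task_id="$2"
    ls "$dir" | sed -n "${task_id}p"
}

# Demonstration on a throwaway directory with three stand-in query files.
demo_dir=$(mktemp -d)
touch "${demo_dir}/blast_query_000.fa" "${demo_dir}/blast_query_001.fa" "${demo_dir}/blast_query_002.fa"
for task_id in 1 2 3 4; do
    file=$(pick_query "${demo_dir}" "${task_id}")
    if [ -z "${file}" ]; then
        # An index past the end of the listing selects nothing.
        echo "task ${task_id}: no query file, nothing to do"
    else
        # Same suffix stripping the script uses to name its output.
        echo "task ${task_id}: ${file} -> ${file%.*}.out"
    fi
done
rm -rf "${demo_dir}"
```

A guard like the empty-string test above could be added to the job script so that a task with no matching file exits cleanly instead of calling blastn with an empty query path.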

