Blast Job Scripts

Latest revision as of 20:27, 3 June 2022

Back to the BLAST page

See the Annotated SLURM Job Script page for an explanation of the #SBATCH directives.

Replace all <VARIABLE> sections with your information.

Simple BLASTP Job

Run a blastp job with 4 threads against the nr database.

Download raw source of the blastp.sh file.

#!/bin/bash
#SBATCH --job-name=<JOBNAME>
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-type=FAIL,END
#SBATCH --output <blastp_%j.log>
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH --time=4:00:00

module load ncbi_blast

# -num_threads matches the 4 CPUs requested with --cpus-per-task above
blastp -query query.fa -db nr -out output.txt -outfmt 6 -evalue 0.001 -num_threads 4


BLASTN Job Array

Generate input query files from a single fasta file
  • Create and change into input directory
mkdir input
cd input
  • Split the query file
faSplit sequence ../large.fasta 120 blast_query_

Note that the requested number of chunks (120) is larger than the 100-task array used below. That's because faSplit, from the UCSC Genome Browser utilities, does not necessarily split the input query file into exactly the number of chunks requested. Some experimentation may be required to find a number of small query files that gives the highest throughput for the BLAST alignment project; it depends on the number and size of the entries in the original FASTA query file and on the SLURM allocation of the account used.
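Because the number of files faSplit produces can differ from the number requested, it can help to count the files and derive the array range from that count before submitting. The following is a minimal sketch; array_range is a hypothetical helper name (not part of SLURM or BLAST), and it assumes the input directory holds only the split query files. Run it from the directory that contains input.

```shell
#!/bin/bash
# Hypothetical helper: print a SLURM array range "1-N" for the N files in a directory.
array_range() {
    local dir="$1"
    local count
    # Count regular files only, so stray subdirectories are not included
    count=$(find "$dir" -maxdepth 1 -type f | wc -l)
    # wc -l may pad its output with spaces; arithmetic expansion normalizes it
    echo "1-$((count))"
}

# Example (not executed here):
#   sbatch --array="$(array_range input)" blastp_array.sh
```

Options given on the sbatch command line take precedence over #SBATCH directives in the script, so the range passed this way replaces the one written in the job script. An empty directory yields "1-0", which should be treated as an error rather than submitted.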

Download raw source of the blastp_array.sh file.

#!/bin/bash
#SBATCH --job-name=<JOBNAME>
#SBATCH --mail-user=<EMAIL>
#SBATCH --mail-type=FAIL,END
#SBATCH --output <blastp_%j.log>
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH --time=4:00:00
#SBATCH --array=1-100

module load ncbi_blast
export INPUT_DIR="input"
export OUTPUT_DIR="output"
export LOG_DIR="logs"
mkdir -p ${OUTPUT_DIR} ${LOG_DIR}
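# Pick the Nth file in the input directory, where N is this task's array index
QUERY_FILE=$( ls ${INPUT_DIR} | sed -n ${SLURM_ARRAY_TASK_ID}p )
QUERY="${INPUT_DIR}/${QUERY_FILE}"
OUTPUT="${OUTPUT_DIR}/${QUERY_FILE}.out"
# Use the number of CPUs requested with --cpus-per-task as the thread count
echo -e "Command:\nblastn -query ${QUERY} -db nt -out ${OUTPUT} -evalue 0.001 -outfmt 6 -num_threads ${SLURM_CPUS_PER_TASK}"
blastn -query ${QUERY} -db nt -out ${OUTPUT} -evalue 0.001 -outfmt 6 -num_threads ${SLURM_CPUS_PER_TASK}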
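
The script above picks its query file by taking line N of the ls listing of the input directory, where N is the array task index (SLURM sets SLURM_ARRAY_TASK_ID for each task of a job array). The mapping can be checked on the command line before submitting. This is a sketch only; nth_file is a hypothetical helper name used for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the name of the Nth entry (1-based) in a directory,
# using the same "ls | sed -n Np" selection as the job script.
nth_file() {
    local dir="$1"
    local n="$2"
    ls "$dir" | sed -n "${n}p"
}

# Example (not executed here): show which file array task 2 would process
#   nth_file input 2
```

If the index is larger than the number of files, nth_file prints nothing, which corresponds to an array task that would have no query file to process.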