Difference between revisions of "Blast"

From UFRC
 
(17 intermediate revisions by 2 users not shown)
Line 1: Line 1:
__NOTOC__
+
__NOTOC____NOEDITSECTION__
__NOEDITSECTION__
+
[[Category:Software]][[Category:Biology]][[Category:Genomics]]
[[Category:Software]][[Category:Bioinformatics]]
 
<!-- ########  Template Configuration ######## -->
 
<!--Edit definitions of the variables used in template calls
 
Required variables:
 
app - lowercase name of the application e.g. "amber"
 
url - url of the software page (project, company product, etc) - e.g. "http://ambermd.org/"
 
Optional variables:
 
INTEL - Version of the Intel Compiler e.g. "11.1"
 
MPI - MPI Implementation and version e.g. "openmpi/1.3.4"
 
-->
 
 
{|
 
{|
 
<!--Main settings - REQUIRED-->
 
<!--Main settings - REQUIRED-->
 
|{{#vardefine:app|blast}}
 
|{{#vardefine:app|blast}}
 
|{{#vardefine:url|http://www.ncbi.nlm.nih.gov/books/NBK1763/}}
 
|{{#vardefine:url|http://www.ncbi.nlm.nih.gov/books/NBK1763/}}
<!--Compiler and MPI settings - OPTIONAL -->
+
|{{#vardefine:exe|1}} <!--Present manual instructions for running the software -->
|{{#vardefine:intel|}} <!-- E.g. "11.1" -->
 
|{{#vardefine:mpi|}} <!-- E.g. "openmpi/1.3.4" -->
 
<!--Choose sections to enable - OPTIONAL-->
 
|{{#vardefine:mod|1}} <!--Present instructions for running the software with modules -->
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
 
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
|{{#vardefine:pbs|}} <!--Enable PBS script wiki page link-->
+
|{{#vardefine:job|1}} <!--Enable PBS script wiki page link-->
 
|{{#vardefine:policy|}} <!--Enable policy section -->
 
|{{#vardefine:policy|}} <!--Enable policy section -->
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
+
|{{#vardefine:testing|1}} <!--Enable performance testing/profiling section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
 
|{{#vardefine:citation|}} <!--Enable Reference/Citation section -->
 
|{{#vardefine:citation|}} <!--Enable Reference/Citation section -->
Line 31: Line 16:
 
<!--Description-->
 
<!--Description-->
 
{{#if: {{#var: url}}|
 
{{#if: {{#var: url}}|
{{App_Description|app={{#var:app}}|url={{#var:url}}}}|}}
+
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
 +
 
 
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families. See [http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=ProgSelectionGuide Blast Program Selection Guide] and the main blast website for more details.
 
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families. See [http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=ProgSelectionGuide Blast Program Selection Guide] and the main blast website for more details.
<!--Location-->
+
<!--Modules-->
==Available Versions==
+
==Environment Modules==
* 2.2.24 - the last NCBI BLAST version that featured the "megablast" and the "blastall" programs. If you are using an older pipeline that requires those programs please use the ncbi_blast/2.2.24 module.
+
Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
* 2.2.25 (default) - the current version of NCBI BLAST.
+
==System Variables==
<!-- -->
+
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
{{#if: {{#var: mod}}|==Running the application using modules==
+
* BLASTDB - location of the blast databases.
To use {{#var:app}} with the environment modules system at HPC the following commands are available:
+
<!--Additional-->
 
+
{{#if: {{#var: exe}}|==Additional Information==
Get module information for {{lc: {{PAGENAME}}}}:  
+
We provide both NCBI BLAST+ and legacy blast binaries through the same modules.
$module spider ncbi_{{#var:app}}
 
{{#if: {{#var:intel}}|Load Intel compiler: {{#tag:pre|$module load intel/{{#var:intel}}}}|}}{{#if: {{#var:mpi}}|Load MPI implementation: {{#tag:pre|$module load {{#var:mpi}}}}|}}
 
Load the application module:
 
$module load ncbi_{{#var:app}}
 
 
 
The modulefile for this software adds the directory with executable files to the shell execution PATH and sets the following environment variables:
 
 
 
HPC_{{uc:{{#var:app}}}}_DIR - directory where {{#var:app}} is located.
 
 
|}}
 
|}}
 
==BLAST databases==
 
==BLAST databases==
Follow the link to see the Information on the [[BLASTDB|UF HPC provided BLAST databases]].
+
Follow the link to see the Information on the [[BLASTDB|UFRC provided BLAST databases]].
{{#if: {{#var: exe}}|==Manual execution instructions==
 
WRITE INSTRUCTIONS ON RUNNING THE APP WITHOUT MODULES HERE|}}
 
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
 
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
 
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
{{#if: {{#var: pbs}}|==PBS Script Examples==
+
{{#if: {{#var: job}}|==Job Script Examples==
See the [[{{PAGENAME}}_PBS]] page for {{#var: app}} PBS script examples.|}}
+
See the [[{{PAGENAME}}_Job_Scripts]] page for {{#var: app}} Job script examples.|}}
{{#if: {{#var: policy}}|==Usage policy==
+
{{#if: {{#var: policy}}|==Usage Policy==
 
WRITE USAGE POLICY HERE (perhaps templates for a couple of main licensing schemes can be used)|}}
 
WRITE USAGE POLICY HERE (perhaps templates for a couple of main licensing schemes can be used)|}}
 
{{#if: {{#var: testing}}|==Performance==
 
{{#if: {{#var: testing}}|==Performance==
WRITE PERFORMANCE TESTING RESULTS HERE|}}
+
Performance comparison for the NCBI distributed binary 'blastn' vs binaries built by us from source using Intel's 2013 compiler or Open64 compiler. The open64 build behaves as a single-threaded build. The test data is a set of 2011229 nucleotide sequences used for the database and its subset of 1000 sequences is used as a query file.
 +
 
 +
[[file:2015-07-20-blastn-binary_intel_open64.png]]
 +
|}}
 
{{#if: {{#var: faq}}|==FAQ==
 
{{#if: {{#var: faq}}|==FAQ==
 
*'''Q:''' **'''A:'''|}}
 
*'''Q:''' **'''A:'''|}}

Latest revision as of 18:17, 12 August 2022

Description

blast website  

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences and to help identify members of gene families. See the BLAST Program Selection Guide and the main BLAST website for more details.

Environment Modules

Run module spider blast to find out what environment modules are available for this application.
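
As a minimal sketch of that lookup followed by a load, the snippet below assumes the module is simply named "blast" (taken from the command above) and that the default version is acceptable. The module_status variable is a hypothetical name used only for illustration.

```shell
# Module name taken from the "module spider blast" command above;
# adjust it if the listing shows a different name or a specific version.
app="blast"

if command -v module >/dev/null 2>&1; then
    module spider "$app"            # list the available versions
    if module load "$app"; then     # load the default version
        module_status="loaded"
    else
        module_status="load failed"
    fi
else
    # Outside the cluster environment the module command does not exist.
    module_status="module command unavailable"
fi

echo "Status for $app: $module_status"
```

To pin a particular version, append it to the module name in the form that "module spider" reports.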

System Variables

  • HPC_BLAST_DIR - installation directory
  • BLASTDB - location of the BLAST databases
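
One way to confirm that these variables are set in the current shell is sketched below. It assumes the module has already been loaded. The show_var helper is a hypothetical name introduced for this example and is not part of the module.

```shell
# Print NAME=value for an exported variable, or a notice when it is unset.
show_var() {
    if value=$(printenv "$1"); then
        echo "$1=$value"
    else
        echo "$1 is not set (is the module loaded?)"
    fi
}

# The two variables listed above.
for var in HPC_BLAST_DIR BLASTDB; do
    show_var "$var"
done
```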

Additional Information

We provide both NCBI BLAST+ and legacy BLAST binaries through the same modules.
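
To see which of the two families a loaded module actually exposes, a quick check along these lines may help. The program names are the usual BLAST+ ones (blastn, blastp) and the legacy ones (blastall, megablast). Which of them a given module version provides is not stated here, so treat the list as an assumption. The have_tool helper is hypothetical.

```shell
# Return success when the named program is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# blastn and blastp are BLAST+ programs; blastall and megablast are legacy ones.
for tool in blastn blastp blastall megablast; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```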

BLAST databases

Follow the link for information on the UFRC-provided BLAST databases.

Job Script Examples

See the Blast_Job_Scripts page for BLAST job script examples.

Performance

This section compares the NCBI-distributed 'blastn' binary with binaries we built from source using the Intel 2013 compiler or the Open64 compiler. The Open64 build behaves as a single-threaded build. The test database consists of 2011229 nucleotide sequences, and a subset of 1000 of those sequences serves as the query file.

[Figure: 2015-07-20-blastn-binary_intel_open64.png]
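
A comparable timing run could be sketched as follows. The file names (all_sequences.fasta, query.fasta, testdb) and the thread counts are hypothetical. Taking the first 1000 records as the query is also an assumption, since the text does not say how the subset was chosen. The makeblastdb and blastn options used are standard BLAST+ options. The timing loop only runs when the tools and the input file are present.

```shell
# Print the first N records of a FASTA file.
# $1 = number of records, $2 = FASTA file
first_n_fasta() {
    awk -v n="$1" '/^>/ { count++ } count > n { exit } { print }' "$2"
}

db_fasta="all_sequences.fasta"   # hypothetical input file

if command -v blastn >/dev/null 2>&1 &&
   command -v makeblastdb >/dev/null 2>&1 &&
   [ -f "$db_fasta" ]; then
    # Build a nucleotide database and take a 1000-record query subset.
    makeblastdb -in "$db_fasta" -dbtype nucl -out testdb
    first_n_fasta 1000 "$db_fasta" > query.fasta

    # Time the same search with different thread counts.
    for threads in 1 4; do
        start=$(date +%s)
        blastn -query query.fasta -db testdb -num_threads "$threads" \
               -outfmt 6 -out "hits_${threads}.tsv"
        end=$(date +%s)
        echo "threads=$threads elapsed=$((end - start))s"
    done
else
    echo "blastn, makeblastdb, or $db_fasta not found; skipping timing run"
fi
```

A build that is effectively single-threaded, as described for Open64 above, would be expected to show little change in elapsed time between the two thread counts.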