Difference between revisions of "Usearch"

From UFRC
m (Text replace - "]] {| <!--Main settings - REQUIRED-->" to "]] {|<!--Main settings - REQUIRED-->")
m (Text replacement - "#uppercase" to "uc")
 
(4 intermediate revisions by 2 users not shown)
Line 5: Line 5:
 
|{{#vardefine:app|usearch}}
 
|{{#vardefine:app|usearch}}
 
|{{#vardefine:url|http://www.drive5.com/usearch/}}
 
|{{#vardefine:url|http://www.drive5.com/usearch/}}
<!--Compiler and MPI settings - OPTIONAL -->
 
 
 
 
|{{#vardefine:citation|1}} <!--Enable Reference/Citation section -->
 
|{{#vardefine:citation|1}} <!--Enable Reference/Citation section -->
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
Line 22: Line 19:
  
 
USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
 
USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
 
+
<!--Modules-->
==Available Versions==
 
* 5.0.151 and 5.1.221 (EL5).
 
* 5.2.32 (EL6).
 
Note: only 32-bit binaries are available, limiting memory use to less than 4 GB. If a 64-bit binary is needed, it will have to be purchased from the author.
 
<!-- -->
 
 
==Required Modules==
 
==Required Modules==
 
[[Modules|modules documentation]]
 
[[Modules|modules documentation]]
 
===Serial===
 
===Serial===
 
*{{#var:app}}
 
*{{#var:app}}
{{#if: {{#var: exe}}|==How To Run==
+
==System Variables==
 +
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 +
<!--Additional-->
 +
{{#if: {{#var: exe}}|==Additional Information==
 
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
 
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
Line 39: Line 34:
 
See the [[{{PAGENAME}}_PBS]] page for {{#var: app}} PBS script examples.|}}
 
See the [[{{PAGENAME}}_PBS]] page for {{#var: app}} PBS script examples.|}}
 
{{#if: {{#var: policy}}|==Usage Policy==
 
{{#if: {{#var: policy}}|==Usage Policy==
Our license only allows the usage of a 32-bit binary for teaching and non-profit (academic research) purposes. If your use falls outside of these categories please procure a license from the author.
+
We have a 64-bit licensed USEARCH binary in the usearch/7.0.1001-64 module.
 
|}}
 
|}}
 
{{#if: {{#var: testing}}|==Performance==
 
{{#if: {{#var: testing}}|==Performance==
Line 69: Line 64:
 
</bibtex>-->
 
</bibtex>-->
 
|}}
 
|}}
 +
=Validation=
 +
* Validated 4/5/2018

Latest revision as of 21:29, 6 December 2019

Description

usearch website: http://www.drive5.com/usearch/

USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
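To make the identity and coverage criteria mentioned above concrete, the following Python sketch computes them for a single pairwise alignment. It is an illustration only, not USEARCH code: the function name alignment_stats and its arguments are hypothetical, and identity is taken here as matching columns divided by alignment columns, which is one common definition and may differ from the definition USEARCH applies by default.

```python
def alignment_stats(query_row, target_row, query_len, target_len):
    """Return identity and coverage for one gapped pairwise alignment.

    query_row and target_row are the two alignment rows, with '-' marking
    gaps. query_len and target_len are the full, unaligned sequence lengths.
    All names are hypothetical and used for illustration only.
    """
    if len(query_row) != len(target_row):
        raise ValueError("alignment rows must have equal length")
    columns = len(query_row)
    if columns == 0:
        raise ValueError("alignment must not be empty")

    # Columns where both rows hold the same residue (gaps never match).
    matches = sum(1 for q, t in zip(query_row, target_row) if q == t and q != "-")

    # Residues of each sequence that take part in the alignment.
    query_aligned = sum(1 for q in query_row if q != "-")
    target_aligned = sum(1 for t in target_row if t != "-")

    return {
        # Assumed definition: matching columns over all alignment columns.
        "identity": matches / columns,
        # Fraction of the query (or target) sequence covered by the alignment.
        "query_coverage": query_aligned / query_len,
        "target_coverage": target_aligned / target_len,
    }


# A 10-residue query aligned locally against an 8-residue target.
stats = alignment_stats("ACGT-ACG", "ACGTTACG", query_len=10, target_len=8)
print(stats)
```

In this example seven of the eight alignment columns match, the alignment covers seven of the ten query residues, and it covers the whole target.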

Required Modules

modules documentation

Serial

  • usearch

System Variables

  • HPC_USEARCH_DIR - installation directory
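
As a minimal sketch of how a script might use this variable, the Python helper below looks up HPC_USEARCH_DIR and fails with a clear message if it is unset, which would typically mean the usearch module has not been loaded. The helper name usearch_install_dir is hypothetical, and the layout of files beneath the installation directory is not documented here, so the sketch returns only the directory itself.

```python
import os
from pathlib import Path


def usearch_install_dir(environ=None):
    """Return the usearch installation directory named by HPC_USEARCH_DIR.

    environ defaults to the process environment; passing a mapping makes the
    helper easy to try out without loading the module. The helper name is
    hypothetical and used for illustration only.
    """
    env = os.environ if environ is None else environ
    value = env.get("HPC_USEARCH_DIR")
    if not value:
        raise RuntimeError(
            "HPC_USEARCH_DIR is not set; load the usearch module first"
        )
    return Path(value)


# Demonstration with a made-up path instead of the real environment.
print(usearch_install_dir({"HPC_USEARCH_DIR": "/example/usearch"}))
```

In a real job script the helper would be called without arguments after the module has been loaded.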


Usage Policy

We have a 64-bit licensed USEARCH binary in the usearch/7.0.1001-64 module.


Citation

If you publish research that uses usearch you have to cite it as follows:

Edgar, Robert C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 2010.

Validation

  • Validated 4/5/2018