Difference between revisions of "Usearch"

From UFRC

Latest revision as of 21:29, 6 December 2019

Description

usearch website: http://www.drive5.com/usearch/

USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (the fraction of the query or target sequence covered by the alignment) and maximum gap length. Output file formats include FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), as in BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
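
To make the coverage and identity criteria above concrete, here is a minimal Python sketch of how such values can be computed for one alignment. It is an illustration only, not USEARCH's own code: the function names, the 1-based inclusive coordinates, and the choice of "identical columns divided by alignment columns" as the identity convention are all assumptions, and other identity conventions exist.

```python
def query_coverage(q_start, q_end, query_length):
    """Fraction of the query covered by the alignment.

    Assumes 1-based, inclusive alignment coordinates on the query
    (a hypothetical convention chosen for this sketch).
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 1 <= q_start <= q_end <= query_length:
        raise ValueError("alignment coordinates fall outside the query")
    return (q_end - q_start + 1) / query_length


def percent_identity(aligned_query, aligned_target):
    """Percent of alignment columns holding the same residue in both rows.

    Both arguments are equal-length alignment rows in which '-' marks a gap.
    Gap columns count toward the denominator but never count as matches.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("alignment rows must have the same length")
    if not aligned_query:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != "-"
    )
    return 100.0 * matches / len(aligned_query)
```

For example, an alignment spanning query positions 11 to 60 of a 100-residue query covers half of the query, and the rows `ACGT` and `ACGA` agree in three of four columns.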

Required Modules

modules documentation

Serial

  • usearch

System Variables

  • HPC_USEARCH_DIR - installation directory
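
As a hedged illustration of how a script might use the module and variable listed above, the Python sketch below checks whether `HPC_USEARCH_DIR` is set and whether an executable can be found. It assumes the module was loaded beforehand (for example with `module load usearch`) and that loading it places an executable named `usearch` on `PATH`; the executable name and the helper function are assumptions, not something this page states.

```python
import os
import shutil


def describe_usearch_environment(env=None):
    """Summarize what the usearch module appears to have set up.

    Nothing is executed; the function only inspects environment values.
    Pass a dict as `env` to inspect something other than os.environ.
    """
    env = os.environ if env is None else env

    # HPC_USEARCH_DIR is the installation directory named on this page.
    install_dir = env.get("HPC_USEARCH_DIR")

    # Assumption: the module adds an executable called "usearch" to PATH.
    executable = shutil.which("usearch", path=env.get("PATH", ""))

    return {
        "module_loaded": install_dir is not None,
        "install_dir": install_dir,
        "executable": executable,
    }


if __name__ == "__main__":
    info = describe_usearch_environment()
    if info["module_loaded"]:
        print("usearch installation directory:", info["install_dir"])
    else:
        print("HPC_USEARCH_DIR is not set; load the usearch module first.")
```

Checking the variable before launching a job gives a clear error message when the module has not been loaded, instead of a failure later in the run.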


Usage Policy

We have a 64-bit licensed USEARCH binary in the usearch/7.0.1001-64 module.


Citation

If you publish research that uses usearch, cite it as follows:

Edgar, Robert C. - Search and clustering orders of magnitude faster than BLAST
Bioinformatics, 2010

Validation

  • Validated 4/5/2018