Difference between revisions of "USEARCH"

From UFRC
m (Text replacement - "#uppercase" to "uc")
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:Software]][[Category:biology]][[Category:bioinformatics]][[Category:alignment]][[Category:sequence analysis]]
+
__NOTOC__
{|<!--CONFIGURATION: REQUIRED-->
+
__NOEDITSECTION__
 +
[[Category:Software]][[Category:Biology]][[Category:Alignment]][[Category:Sequencing]]
 +
{|<!--Main settings - REQUIRED-->
 
|{{#vardefine:app|usearch}}
 
|{{#vardefine:app|usearch}}
 
|{{#vardefine:url|http://www.drive5.com/usearch/}}
 
|{{#vardefine:url|http://www.drive5.com/usearch/}}
<!--CONFIGURATION: OPTIONAL (|1}} means it's ON)-->
+
|{{#vardefine:citation|1}} <!--Enable Reference/Citation section -->
|{{#vardefine:conf|}}           <!--CONFIGURATION-->
+
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
|{{#vardefine:exe|}}           <!--ADDITIONAL INFO-->
+
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
|{{#vardefine:job|}}           <!--JOB SCRIPTS-->
+
|{{#vardefine:pbs|}} <!--Enable PBS script wiki page link-->
|{{#vardefine:policy|}}         <!--POLICY-->
+
|{{#vardefine:policy|1}} <!--Enable policy section -->
|{{#vardefine:testing|}}       <!--PROFILING-->
+
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
|{{#vardefine:faq|}}             <!--FAQ-->
+
|{{#vardefine:faq|}} <!--Enable FAQ section -->
|{{#vardefine:citation|}}       <!--CITATION-->
 
|{{#vardefine:installation|}} <!--INSTALLATION-->
 
 
|}
 
|}
<!--BODY-->
+
<!-- ########  Template Body ######## -->
 
<!--Description-->
 
<!--Description-->
 
{{#if: {{#var: url}}|
 
{{#if: {{#var: url}}|
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
  
USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.
+
USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
USEARCH combines many different algorithms into a single package with outstanding documentation and support. This cuts your learning curve, reduces the number of steps you need to take for a given task, and slashes compute times. USEARCH will encourage you to explore your data, enabling new insights and suggesting new analyses that you might not have tried with slower tools.  
 
 
 
 
<!--Modules-->
 
<!--Modules-->
 
==Environment Modules==
 
==Environment Modules==
Line 27: Line 25:
 
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory
 
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory
 
+
<!--Additional-->
<!--Configuration-->
+
{{#if: {{#var: exe}}|==Additional Information==
 +
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.
+
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
|}}
+
{{#if: {{#var: pbs}}|==PBS Script Examples==
<!--Run-->
+
See the [[{{PAGENAME}}_PBS]] page for {{#var: app}} PBS script examples.|}}
{{#if: {{#var: exe}}|==Additional Information==
 
 
 
WRITE_ADDITIONAL_INSTRUCTIONS_ON_RUNNING_THE_SOFTWARE_IF_NECESSARY
 
 
 
|}}
 
<!--Job Scripts-->
 
{{#if: {{#var: job}}|==Job Script Examples==
 
See the [[{{PAGENAME}}_Job_Scripts]] page for {{#var: app}} Job script examples.
 
|}}
 
<!--Policy-->
 
 
{{#if: {{#var: policy}}|==Usage Policy==
 
{{#if: {{#var: policy}}|==Usage Policy==
 
+
We have a 64-bit licensed USEARCH binary in the usearch/7.0.1001-64 module.
WRITE USAGE POLICY HERE (Licensing, usage, access).
 
 
 
 
|}}
 
|}}
<!--Performance-->
 
 
{{#if: {{#var: testing}}|==Performance==
 
{{#if: {{#var: testing}}|==Performance==
 
+
WRITE PERFORMANCE TESTING RESULTS HERE|}}
WRITE_PERFORMANCE_TESTING_RESULTS_HERE
 
 
 
|}}
 
<!--Faq-->
 
 
{{#if: {{#var: faq}}|==FAQ==
 
{{#if: {{#var: faq}}|==FAQ==
 
*'''Q:''' **'''A:'''|}}
 
*'''Q:''' **'''A:'''|}}
<!--Citation-->
 
 
{{#if: {{#var: citation}}|==Citation==
 
{{#if: {{#var: citation}}|==Citation==
If you publish research that uses {{#var:app}} you have to cite it as follows:
+
If you publish research that uses {{#var: app}} you have to cite it as follows:
 
+
<pre>
WRITE_CITATION_HERE
+
Edgar, Robert C. - Search and clustering orders of magnitude faster than BLAST
 
+
Bioinformatics, 2010
 +
Author : Edgar, Robert C.
 +
Title : Search and clustering orders of magnitude faster than BLAST
 +
Publication : Bioinformatics
 +
Date : 2010
 +
</pre>
 +
<!--
 +
<bibtex>
 +
@article{Edgar12082010,
 +
author = {Edgar, Robert C.},
 +
title = {Search and clustering orders of magnitude faster than BLAST},
 +
year = {2010},
 +
doi = {10.1093/bioinformatics/btq461},
 +
abstract ={Motivation: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification.Results: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely-used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.Availability: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.Contact: robert@drive5.com.},
 +
URL = {http://bioinformatics.oxfordjournals.org/content/early/2010/08/12/bioinformatics.btq461.abstract},
 +
eprint = {http://bioinformatics.oxfordjournals.org/content/early/2010/08/12/bioinformatics.btq461.full.pdf+html},
 +
journal = {Bioinformatics}
 +
}
 +
</bibtex>-->
 
|}}
 
|}}
<!--Installation-->
 
{{#if: {{#var: installation}}|==Installation==
 
See the [[{{PAGENAME}}_Install]] page for {{#var: app}} installation notes.|}}
 
<!--Turn the Table of Contents and Edit paragraph links ON/OFF-->
 
__NOTOC____NOEDITSECTION__
 

Latest revision as of 17:19, 22 August 2022

Description

usearch website: http://www.drive5.com/usearch/

USEARCH is a unique high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.
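
The matching criteria named above (identity, coverage and E-value) can also be applied after the fact to tab-separated hit output. The following is a minimal Python sketch, not part of USEARCH itself. It assumes the hits are in the common 12-column BLAST-style tabular layout with a numeric E-value field, and that query lengths are supplied separately; the function name, thresholds and sample data are hypothetical.

```python
import csv
import io

# Column names of the 12-field BLAST-style tabular layout (an assumption here).
FIELDS = ["query", "target", "identity", "aln_len", "mismatches", "gap_opens",
          "q_start", "q_end", "t_start", "t_end", "evalue", "bitscore"]


def filter_hits(handle, query_lengths, min_identity=90.0, min_coverage=0.8,
                max_evalue=1e-5):
    """Yield hits that pass identity, query-coverage and E-value thresholds."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(FIELDS):
            continue  # skip blank or malformed lines
        hit = dict(zip(FIELDS, row))
        identity = float(hit["identity"])
        evalue = float(hit["evalue"])
        # Query coverage: fraction of the query spanned by the alignment.
        span = abs(int(hit["q_end"]) - int(hit["q_start"])) + 1
        coverage = span / query_lengths[hit["query"]]
        if (identity >= min_identity and coverage >= min_coverage
                and evalue <= max_evalue):
            yield hit["query"], hit["target"], identity, round(coverage, 3)


# Hypothetical example data: two hits for one query of length 100.
SAMPLE = (
    "q1\tt1\t98.0\t95\t2\t0\t1\t95\t10\t104\t1e-40\t180\n"
    "q1\tt2\t99.0\t40\t0\t0\t1\t40\t5\t44\t1e-12\t75\n"
)
kept = list(filter_hits(io.StringIO(SAMPLE), {"q1": 100}))
print(kept)
```

In this sample the second hit is dropped because it covers only 40% of the query, even though its identity is higher than that of the first hit.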

Environment Modules

Run module spider usearch to find out what environment modules are available for this application.

System Variables

  • HPC_USEARCH_DIR - installation directory
  • HPC_USEARCH_BIN - executable directory
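
A script can use these variables to locate the program instead of hard-coding a path. The sketch below is one possible way to do this in Python. It assumes the executable inside HPC_USEARCH_BIN is named "usearch", which this page does not state; the helper name and the example path are hypothetical.

```python
import os


def usearch_executable(env=None, name="usearch"):
    """Return the expected path of the binary under HPC_USEARCH_BIN.

    The default binary name "usearch" is an assumption, not taken from this page.
    """
    env = os.environ if env is None else env
    bin_dir = env.get("HPC_USEARCH_BIN")
    if not bin_dir:
        raise RuntimeError(
            "HPC_USEARCH_BIN is not set; load the usearch module first")
    return os.path.join(bin_dir, name)


# Hypothetical value for illustration only; the real path is set by the module.
example_env = {"HPC_USEARCH_BIN": "/path/to/usearch/bin"}
print(usearch_executable(example_env))
```

Calling `usearch_executable()` with no arguments reads the current environment, so it raises an error when the module has not been loaded.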


Usage Policy

A licensed 64-bit USEARCH binary is available in the usearch/7.0.1001-64 module.


Citation

If you publish research that uses usearch, cite it as follows:

Author : Edgar, Robert C.
Title : Search and clustering orders of magnitude faster than BLAST
Publication : Bioinformatics
Date : 2010
DOI : 10.1093/bioinformatics/btq461