Difference between revisions of "PAUDA"

From UFRC
m (Text replacement - "#uppercase" to "uc")
Line 34:
 ==System Variables==
- * HPC_{{#uppercase:{{#var:app}}}}_DIR - installation directory
+ * HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 <!--Configuration-->
 {{#if: {{#var: conf}}|==Configuration==

Revision as of 21:23, 6 December 2019

Description

PAUDA website  

PAUDA is an approach to comparing DNA reads against a database of protein reference sequences that scales to very large datasets of hundreds of millions or billions of reads. The name is an acronym for "Protein Alignment Using a DNA Aligner". The approach harnesses the high efficiency of DNA read aligners to compute BLASTX-like alignments between sequencing reads and a protein database in a small fraction of the time required by BLASTX, which makes it possible to process millions of reads per CPU hour. According to its authors, PAUDA is about 10,000 times faster than BLASTX.

Required Modules

Serial

  • PAUDA

Parallel (OpenMP)

  • intel
  • PAUDA

Parallel (MPI)

  • intel
  • openmpi
  • PAUDA
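
As a minimal sketch, the three module sets above could be selected and loaded as shown here. It assumes the cluster provides an Lmod or Environment Modules style `module` command and that the module names match the list exactly; the helper `pauda_modules` is hypothetical, and the actual spelling or case of the module name (for example `pauda`) may differ on the system, which `module spider` can confirm on Lmod.

```shell
# Hypothetical helper: print the modules listed above for a given build type.
pauda_modules() {
    case "$1" in
        serial) echo "PAUDA" ;;
        openmp) echo "intel PAUDA" ;;
        mpi)    echo "intel openmpi PAUDA" ;;
        *)      echo "usage: pauda_modules serial|openmp|mpi" >&2; return 1 ;;
    esac
}

# Load the serial set, but only when a `module` command exists in this shell
# (assumption: Lmod or Environment Modules is installed).
if command -v module >/dev/null 2>&1; then
    # Unquoted on purpose so each module name becomes a separate argument.
    module load $(pauda_modules serial)
fi
```

Replacing `serial` with `openmp` or `mpi` loads the compiler and MPI modules before PAUDA, in the order the lists give them.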

System Variables

  • HPC_PAUDA_DIR - installation directory
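
The variable above can be checked before it is used in a job script. The following is a sketch only: the helper name `pauda_dir` is hypothetical, and it assumes that `HPC_PAUDA_DIR` is set by loading the PAUDA module.

```shell
# Hypothetical helper: print the PAUDA installation directory, or fail with a
# message when the variable is unset (e.g. the module has not been loaded).
pauda_dir() {
    if [ -z "${HPC_PAUDA_DIR:-}" ]; then
        echo "HPC_PAUDA_DIR is not set; load the PAUDA module first" >&2
        return 1
    fi
    printf '%s\n' "$HPC_PAUDA_DIR"
}
```

For example, `ls "$(pauda_dir)"` would list the contents of the installation directory once the module is loaded.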




Citation

If you publish research that uses PAUDA, please cite it as follows:

Daniel H. Huson and Chao Xie, A poor man’s BLASTX - high-throughput metagenomic protein database search using PAUDA, submitted to HitSeq (2013).


Validation

  • Validated 4/5/2018