ABruijn

Description

ABruijn is a de novo assembler for PacBio and Oxford Nanopore Technologies reads. The algorithm uses an A-Bruijn graph to find overlaps between reads and does not require the reads to be error-corrected. First, the algorithm produces a draft assembly by concatenating different parts of raw reads. This coarse sequence is then polished into a high-quality assembly.
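To illustrate the general idea of finding overlaps between uncorrected reads, the following Python sketch counts the k-mers shared by each pair of reads. It is a toy illustration only, not ABruijn's implementation: the function names, the example reads, and the choice of k = 4 are all assumptions made for this example, and the actual tool is considerably more elaborate.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_counts(reads, k):
    """Count distinct k-mers shared by each pair of reads.

    Hypothetical helper for illustration: returns a dict mapping
    (i, j) with i < j to the number of k-mers present in both reads.
    """
    # Index every k-mer by the reads that contain it.
    index = defaultdict(set)
    for read_id, seq in enumerate(reads):
        for kmer in kmers(seq, k):
            index[kmer].add(read_id)

    # Each k-mer found in two or more reads adds one to every pair.
    counts = defaultdict(int)
    for ids in index.values():
        ordered = sorted(ids)
        for a in range(len(ordered)):
            for b in range(a + 1, len(ordered)):
                counts[(ordered[a], ordered[b])] += 1
    return dict(counts)


# Made-up reads: the first two overlap, the third is unrelated.
reads = ["ACGTACGGTCA", "CGGTCATTGAC", "TTTTTTTTTTT"]
print(shared_kmer_counts(reads, 4))  # {(0, 1): 3}
```

Pairs with many shared k-mers are candidates for overlapping reads. Pairs that share none, like the third read here, do not appear in the result.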

ABruijn works for both bacterial and eukaryotic genomes. Typically, assembly of a bacterial genome with 50x coverage takes less than an hour on a modern desktop, while a yeast assembly takes about 5 hours. A eukaryotic genome of size 200 Mbp can be assembled within a day on a computational server.
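Coverage, as used above, is the total number of sequenced bases divided by the genome size. A minimal Python sketch for estimating it before starting an assembly follows. The function name and the example numbers are assumptions chosen for illustration.

```python
def estimate_coverage(read_lengths, genome_size):
    """Return average coverage: total read bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical data set: 25,000 reads of 10 kbp each, 5 Mbp genome.
read_lengths = [10_000] * 25_000
print(estimate_coverage(read_lengths, 5_000_000))  # 50.0
```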

Required Modules

Parallel (OpenMP)

gcc/5.2.0
abruijn

System Variables

HPC_ABRUIJN_DIR - installation directory

Citation

If you publish research that uses ABruijn, cite it as follows:

Yu Lin, Jeffrey Yuan, Mikhail Kolmogorov, Max W Shen, Pavel Pevzner, "Assembly of Long Error-Prone Reads Using de Bruijn Graphs" (http://biorxiv.org/content/early/2016/04/13/048413)