Fast-Plast

From UFRC

Description

fast-plast website  

Fast-Plast is a pipeline that leverages existing and novel programs to quickly assemble, orient, and verify whole chloroplast genome sequences. For most datasets with sufficient data, Fast-Plast is able to produce a full-length de novo chloroplast genome assembly in approximately 30 minutes with no user mediation. In addition to a chloroplast sequence, Fast-Plast identifies chloroplast genes present in the final assembly.

Currently, Fast-Plast is written to accommodate Illumina data, although most data types could be used.

Fast-Plast uses a de novo assembly approach by combining the De Bruijn graph-based method of SPAdes with an iterative seed-based assembly implemented in afin to close gaps of contigs with low coverage. The pipeline then identifies regions from the quadripartite structure of the chloroplast genome, assigns identity, and orders them according to standard convention. A coverage analysis is then conducted to assess the quality of the final assembly.
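As a rough illustration of the ordering step, the sketch below joins the regions of the quadripartite structure in the commonly used order: large single copy (LSC), inverted repeat B (IRb), small single copy (SSC), inverted repeat A (IRa), with IRa taken as the reverse complement of IRb. It assumes that the "standard convention" above refers to this order. The function names and toy sequences are hypothetical, and this is not Fast-Plast's own code.

```python
# Illustrative only: hypothetical helper names, not part of Fast-Plast.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def order_regions(lsc: str, irb: str, ssc: str) -> str:
    """Join regions as LSC + IRb + SSC + IRa (assumed convention).

    IRa is derived as the reverse complement of IRb, since the two
    inverted repeats are copies of each other in opposite orientation.
    """
    ira = reverse_complement(irb)
    return lsc + irb + ssc + ira


if __name__ == "__main__":
    # Toy sequences, far shorter than real chloroplast regions.
    print(order_regions("AAAC", "GGT", "CC"))
```

Deriving IRa from IRb rather than passing it in separately keeps the two repeat copies consistent in the toy example; a real assembly would verify both copies against the reads.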

Environment Modules

Run module spider fast-plast to find out what environment modules are available for this application.

System Variables

  • HPC_FAST-PLAST_DIR - installation directory
  • HPC_FAST-PLAST_BIN - executable directory
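A hyphen is not a valid character in a shell variable name, so the variables listed above cannot be expanded with the usual `$NAME` syntax in a job script. The sketch below looks them up from Python instead. It first tries the names exactly as documented and then an underscore variant; the underscore variant is an assumption, not something this page confirms, and the helper name is hypothetical.

```python
import os


def find_install_dirs(env=None):
    """Look up the Fast-Plast install and executable directories.

    Tries the documented names (HPC_FAST-PLAST_DIR / _BIN) first, then an
    underscore variant (HPC_FAST_PLAST_DIR / _BIN), which is an assumption.
    Returns a dict with keys "DIR" and "BIN"; values are None when unset.
    """
    env = os.environ if env is None else env
    found = {}
    for suffix in ("DIR", "BIN"):
        documented = f"HPC_FAST-PLAST_{suffix}"
        assumed = documented.replace("-", "_")  # hypothetical alternative name
        found[suffix] = env.get(documented) or env.get(assumed)
    return found


if __name__ == "__main__":
    # Run after loading the environment module for this application.
    for key, value in find_install_dirs().items():
        print(key, value if value is not None else "(not set)")
```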




Citation

If you publish research that uses Fast-Plast, cite it as follows:

Bankevich, A., S. Nurk, D. Antipov, A. A. Gurevich, M. Dvorkin, A. S. Kulikov, V. M. Lesin, S. I. Nikolenko, S. Pham, A. D. Prjibelski, A. V. Pyshkin, A. V. Sirotkin, N. Vyahhi, G. Tesler, M. A. Alekseyev, and P. A. Pevzner. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comp. Biol., 19(5):455-477.

Bolger, A. M., M. Lohse, and B. Usadel. 2014. Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.

Langmead, B. and S. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nature Methods, 9:357-359.

Marçais, G. and C. Kingsford. 2011. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics, 27:764-770.