Reptile

Description

Reptile is a C++ program for correcting sequencing errors in short reads from next-generation sequencing platforms. Reptile has several favorable properties:

Memory efficiency. Reptile can process input data larger than main memory. For instance, processing a 160x-coverage (3.8 GB) Illumina data set for E. coli requires only ~1 GB of memory, which is readily available on a desktop computer.
High speed. Processing Illumina data for a microbe typically takes 0.5 to 2 hours, depending on the number and quality of the reads.
Handles reads containing non-ACGT characters and reads of unequal length.
Makes simple use of quality score information.
Reptile was developed by Xiao Yang, Karin Dorman and Srinivas Aluru.

Upstream documentation for reptile.

Environment Modules

Run module spider reptile to find out what environment modules are available for this application.

System Variables

HPC_REPTILE_DIR - Installation directory
HPC_REPTILE_BIN - Executable directory
HPC_REPTILE_CONF - Sample configuration files
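
As a quick check, the variables above can be printed after loading the module. This is a minimal sketch: the module name reptile is assumed from the module spider command, and the exact name or version on your system may differ.

```shell
# "module" is provided by the cluster's environment-modules setup;
# skip the load when it is not available (for example, off-cluster).
if command -v module >/dev/null 2>&1; then
    module load reptile   # assumed module name; confirm with module spider
fi

# Print each variable, or a note when it is not set in the environment.
for var in HPC_REPTILE_DIR HPC_REPTILE_BIN HPC_REPTILE_CONF; do
    echo "$var=$(printenv "$var" || echo '<unset>')"
done
```

If a variable shows up as unset, the module has most likely not been loaded in the current shell.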

Additional Information

When running the reptile-omp binary, set the environment variable OMP_NUM_THREADS to the number of threads you want Reptile to use.
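
A minimal sketch of setting the thread count before a run follows. The value 4 and the file name my-reptile.conf are placeholders, and passing the configuration file as the only argument to reptile-omp is an assumption that should be checked against the upstream documentation.

```shell
# The reptile module must already be loaded so that reptile-omp is on PATH.

# Number of OpenMP threads for reptile-omp; 4 is a placeholder value.
export OMP_NUM_THREADS=4

# Hypothetical configuration file name; replace it with your own.
config="my-reptile.conf"

# Run only if the binary and the configuration file are both present.
# Assumption: reptile-omp takes the configuration file as its argument.
if command -v reptile-omp >/dev/null 2>&1 && [ -f "$config" ]; then
    reptile-omp "$config"
else
    echo "reptile-omp or $config not found; load the module and check the path" >&2
fi
```

In a batch job, it usually makes sense to keep this value no larger than the number of cores requested from the scheduler.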

There is a reptile tutorial written by Daniel S. Standage.

The available binaries and scripts include:

fastq-converter, reptile_merger, reptile-omp, reptile-omp-intel, reptile-v1.1, and seq-analy.

Sample configuration files are located in the $HPC_REPTILE_CONF directory.
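
One way to start from the samples is to copy them into a working directory and edit the copies. The sketch below assumes the reptile module is loaded; the destination name reptile-conf is a hypothetical choice, and the file names inside the sample directory are not listed here, so inspect them with ls first.

```shell
# HPC_REPTILE_CONF is set when the reptile module is loaded; default to an
# empty string so the check below also works when it is unset.
conf_dir="${HPC_REPTILE_CONF:-}"

# Hypothetical destination directory for your working copies.
work_dir="reptile-conf"

if [ -n "$conf_dir" ] && [ -d "$conf_dir" ]; then
    ls "$conf_dir"                     # see which sample files are provided
    mkdir -p "$work_dir"
    cp -R "$conf_dir"/. "$work_dir"/   # copy the samples for local editing
else
    echo "HPC_REPTILE_CONF is unset or not a directory; load the reptile module first" >&2
fi
```

Editing the copies rather than the originals keeps the installed samples intact for other users.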

Citation

If you publish research that uses Reptile, cite it as follows: X. Yang, K. Dorman, and S. Aluru, “Reptile: Representative tiling for short read error correction”, Bioinformatics, 26(20), 2526-2533, 2010.