ea-utils are command-line tools for processing biological sequencing data. Barcode demultiplexing, adapter trimming, etc. They are primarily written to support Illumina based pipelines but should work with any FASTQs.
- Scans a sequence file for adapters, and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also does skewing detection and quality filtering.
- Demultiplexes a fastq. Capable of auto-determining barcode id's based on a master set fields. Keeps multiple reads in-sync during demultiplexing. Can verify that the reads are in-sync as well, and fail if they're not.
- Similar to audy's stitch program, but in C, more efficient and supports some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.
- Takes a pileup and calculates variants in a more easily parameterized manner than some other tools.
- Basic sam/bam stats. Like other tools, but produces what I want to look at, in a format suitable for passing to other programs. (View source)
- Basic fastq stats. Counts duplicates. Option for per-cycle stats, or not (irrelevant for many sequencers).
If you publish research that uses eautils you have to cite it as follows:
Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; http://code.google.com/p/ea-utils
Erik Aronesty (2013). TOBioiJ : "Comparison of Sequencing Utility Programs", DOI:10.2174/1875036201307010001
- Validated 4/5/2018