Ea-utils
Revision as of 14:54, 15 August 2022 by Israel.herrera
Description
ea-utils is a collection of command-line tools for processing biological sequencing data, covering tasks such as barcode demultiplexing and adapter trimming. The tools were primarily written to support Illumina-based pipelines but should work with any FASTQ files.
- Overview
- fastq-mcf
- Scans a sequence file for adapters, and, based on a log-scaled threshold, determines a set of clipping parameters and performs clipping. Also does skewing detection and quality filtering.
- fastq-multx
- Demultiplexes a FASTQ file. Capable of auto-determining barcode IDs based on a master set of fields. Keeps multiple reads in sync during demultiplexing. Can also verify that the reads are in sync, and fail if they are not.
- fastq-join
- Similar to audy's stitch program, but written in C, more efficient, and with support for some automatic benchmarking and tuning. It uses the same "squared distance for anchored alignment" as other tools.
- varcall
- Takes a pileup and calculates variants in a more easily parameterized manner than some other tools.
- sam-stats
- Basic SAM/BAM stats. Like other tools, but produces the statistics the author wanted to look at, in a format suitable for passing to other programs.
- fastq-stats
- Basic fastq stats. Counts duplicates. Option for per-cycle stats, or not (irrelevant for many sequencers).
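To make the kind of summary described for fastq-stats more concrete, here is a minimal Python sketch that counts reads, duplicate sequences, and per-cycle mean quality. It is only an illustration of the idea, not the ea-utils implementation: the function names `read_fastq` and `fastq_summary` are hypothetical, and it assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io
from collections import Counter


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:            # end of file
            return
        if not header.strip():    # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()         # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def fastq_summary(handle, phred_offset=33):
    """Count reads, exact-duplicate sequences, and mean quality per cycle.

    Assumption: qualities are Phred+33 encoded (hypothetical default).
    """
    seq_counts = Counter()
    qual_sums = []    # sum of quality scores at each cycle (read position)
    qual_counts = []  # number of reads that reach each cycle
    for _name, seq, qual in read_fastq(handle):
        seq_counts[seq] += 1
        for cycle, ch in enumerate(qual):
            if cycle == len(qual_sums):
                qual_sums.append(0)
                qual_counts.append(0)
            qual_sums[cycle] += ord(ch) - phred_offset
            qual_counts[cycle] += 1
    total = sum(seq_counts.values())
    return {
        "reads": total,
        # every read beyond the first copy of a sequence counts as a duplicate
        "duplicates": total - len(seq_counts),
        "per_cycle_mean_quality": [s / n for s, n in zip(qual_sums, qual_counts)],
    }


# Three toy reads; r1 and r2 share the same sequence.
example = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII5I\n@r3\nTTGA\n+\n5555\n"
summary = fastq_summary(io.StringIO(example))
print(summary["reads"], summary["duplicates"])  # prints: 3 1
```

For real data, the fastq-stats tool itself is the better choice; a sketch like this is mainly useful for understanding what the reported numbers mean.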
Environment Modules
Run module spider eautils
to find out what environment modules are available for this application.
System Variables
- HPC_EAUTILS_DIR - installation directory
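After loading one of the environment modules, a short Python check like the following can confirm that the tools are reachable before a pipeline starts. This is a sketch under assumptions: the helper name `find_eautils` is hypothetical, the executable names are taken from the overview list above, and it assumes the module places the executables on `PATH` and sets `HPC_EAUTILS_DIR`.

```python
import os
import shutil

# Executable names as listed in the overview above.
TOOLS = ["fastq-mcf", "fastq-multx", "fastq-join",
         "varcall", "sam-stats", "fastq-stats"]


def find_eautils(tools=TOOLS, environ=None):
    """Return (install_dir, {tool: path or None}) for the given environment.

    install_dir is the value of HPC_EAUTILS_DIR, or None if it is unset.
    """
    environ = os.environ if environ is None else environ
    install_dir = environ.get("HPC_EAUTILS_DIR")
    search_path = environ.get("PATH")
    found = {tool: shutil.which(tool, path=search_path) for tool in tools}
    return install_dir, found


if __name__ == "__main__":
    install_dir, found = find_eautils()
    print("HPC_EAUTILS_DIR:", install_dir or "(not set; is the module loaded?)")
    for tool, path in found.items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If every tool is reported as not found, the most likely cause is that no ea-utils module has been loaded in the current shell.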
Citation
If you publish research that uses ea-utils, cite it as follows:
Erik Aronesty (2011). ea-utils : "Command-line tools for processing biological sequencing data"; http://code.google.com/p/ea-utils
Erik Aronesty (2013). TOBioiJ : "Comparison of Sequencing Utility Programs", DOI:10.2174/1875036201307010001