Difference between revisions of "Sailfish"
Moskalenko m (Text replacement - "#uppercase" to "uc")
Line 37:
  -->
  ==System Variables==
− * HPC_{{
+ * HPC_{{uc:{{#var:app}}}}_DIR
  <!--Configuration-->
  {{#if: {{#var: conf}}|==Configuration==
Revision as of 21:24, 6 December 2019
Description
RNA-seq expression estimates need not take longer than a cup of coffee
The quantification of gene or isoform abundance is a fundamental step in many transcriptome analysis tasks, such as determining differential expression between biological samples. Yet, estimating isoform abundance from a large set of RNA-seq reads remains a computationally intensive task, owing in large part to the necessity of read mapping. To address this problem directly, we developed Sailfish, a software tool that implements a novel, alignment-free algorithm for the estimation of isoform abundances directly from a set of reference sequences and RNA-seq reads. Rather than working at the read level, the fundamental unit of transcript coverage in Sailfish is the k-mer. This lightweight alternative allows Sailfish to dispense with many of the complexities of read mapping while remaining robust to sequencing errors. By replacing read mapping with intelligent k-mer indexing and counting, Sailfish is able to quantify isoform abundance orders of magnitude faster than existing tools. For example, Sailfish processes a set of 150 million reads in about 15 minutes, where existing tools take over 6 hours.
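To make the idea of k-mer indexing and counting concrete, here is a minimal Python sketch. It is an illustration only, not Sailfish's implementation: the function names, the toy sequences, and the choice of k = 4 are all assumptions, and it ignores details a real tool would need, such as reverse complements and memory-efficient hashing.

```python
from collections import Counter

def count_kmers(sequence, k):
    """Count every overlapping k-mer (length-k substring) in one sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def count_read_kmers(reads, indexed_kmers, k):
    """Tally k-mers from the reads, keeping only those present in the index."""
    counts = Counter()
    for read in reads:
        for kmer, n in count_kmers(read, k).items():
            if kmer in indexed_kmers:
                counts[kmer] += n
    return counts

# Hypothetical toy data: two reference transcripts and three reads.
transcripts = {"tx1": "ACGTACGT", "tx2": "TTGACGTA"}
k = 4

# Index step: k-mer occurrences per transcript, plus the set of all known k-mers.
index = {name: count_kmers(seq, k) for name, seq in transcripts.items()}
indexed_kmers = set().union(*index.values())

# Counting step: no read is aligned; reads are only broken into k-mers.
reads = ["ACGTAC", "GACGTA", "NNNNNN"]
read_counts = count_read_kmers(reads, indexed_kmers, k)
```

In this sketch a k-mer that never occurs in any reference transcript, such as one from the all-N read, is simply dropped. That gives a rough intuition for why a k-mer-based approach can tolerate sequencing errors: an error spoils only the k-mers that overlap it, not the whole read.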
This increase in speed is obtained without sacrificing accuracy. Sailfish implements an efficient, accelerated expectation-maximization algorithm for quantifying isoform abundance that produces high-quality results, and is capable of correcting numerous types of systematic bias that are known to occur in RNA-seq experiments. In the paper, we demonstrate that, on both real and synthetic data, Sailfish is as accurate as existing read mapping-based tools such as eXpress and Cufflinks.
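The expectation-maximization idea can be sketched on k-mer counts as follows. This is a plain, unaccelerated EM for a simple mixture model, written under assumed names and toy inputs. It does not reproduce Sailfish's accelerated algorithm or its bias correction, and the returned values are only each transcript's estimated share of the observed k-mers.

```python
def em_abundances(transcript_kmers, read_counts, iterations=100):
    """Estimate each transcript's share of the observed k-mer counts with plain EM.

    transcript_kmers: {transcript: {kmer: occurrences in that transcript}}
    read_counts:      {kmer: number of times the k-mer was seen in the reads}
    """
    names = list(transcript_kmers)
    totals = {t: sum(transcript_kmers[t].values()) for t in names}
    # Start from a uniform guess.
    theta = {t: 1.0 / len(names) for t in names}
    for _ in range(iterations):
        assigned = {t: 0.0 for t in names}
        # E-step: split each k-mer's count among the transcripts containing it,
        # in proportion to current abundance times the k-mer's frequency there.
        for kmer, count in read_counts.items():
            weights = {
                t: theta[t] * transcript_kmers[t].get(kmer, 0) / totals[t]
                for t in names
            }
            norm = sum(weights.values())
            if norm == 0:
                continue  # k-mer not explained by any transcript
            for t in names:
                assigned[t] += count * weights[t] / norm
        # M-step: new abundances are the normalized assigned counts.
        total = sum(assigned.values())
        if total == 0:
            break
        theta = {t: assigned[t] / total for t in names}
    return theta

# Hypothetical toy input: "CCCC" is shared, so its counts are ambiguous.
transcript_kmers = {
    "txA": {"AAAA": 1, "CCCC": 1},
    "txB": {"CCCC": 1, "GGGG": 1},
}
read_counts = {"AAAA": 30, "CCCC": 40, "GGGG": 10}
theta = em_abundances(transcript_kmers, read_counts)
```

In this toy case the unique k-mers favor txA three to one, so the shared k-mer's counts are split accordingly and the estimate settles near 0.75 for txA and 0.25 for txB.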
Required Modules
Serial
- sailfish
System Variables
- HPC_SAILFISH_DIR
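A script can check for this variable before running anything that depends on the module. The sketch below assumes that the variable is set when the sailfish module is loaded, that it holds the installation directory (as its name suggests), and that the name follows an HPC_<APP>_DIR pattern. The helper names are hypothetical.

```python
import os

def hpc_dir_variable(app):
    """Build the variable name for an application: 'sailfish' -> 'HPC_SAILFISH_DIR'."""
    return f"HPC_{app.upper()}_DIR"

def find_install_dir(app, environ=None):
    """Return the variable's value, or None if it is not set in the environment."""
    environ = os.environ if environ is None else environ
    return environ.get(hpc_dir_variable(app))

install_dir = find_install_dir("sailfish")
if install_dir is None:
    print("HPC_SAILFISH_DIR is not set; the sailfish module may not be loaded")
else:
    print(f"HPC_SAILFISH_DIR = {install_dir}")
```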
Validation
- Validated 4/5/2018