rascaf website  

Rascaf (RnA-seq SCAFfolder) uses continuity and order information from paired-end RNA-seq reads to improve a draft assembly, particularly in the gene regions. It takes as input an assembly and one or several RNA-seq data sets aligned to the genome, and recruits additional contigs into the assembly, potentially adjusting some scaffolds to better fit the data and to create longer gene models. Rascaf works in three stages. It first computes a set of candidate contig connections from the raw (original) assembly that are supported by the RNA-seq data. Then, in an optional step, the user can choose to validate and filter the connections by searching the merged gene sequences against public sequence databases. Finally, Rascaf uses these connections to select and/or re-arrange additional contigs within scaffolds and chromosomes. When Rascaf is run with multiple RNA-seq data sets, it first generates a set of connections for each set independently. Rascaf then reconciles all connections during a 'join' step that detects and resolves any conflicts.

Required Modules


  • gcc/5.2.0
  • rascaf

System Variables

  • HPC_RASCAF_DIR - installation directory
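
The workflow described above (per-data-set connections followed by a 'join' step) can be sketched as a short shell session. This is a minimal sketch, not a site-endorsed recipe. The file names (assembly.fa, reads_1.bam, reads_2.bam, assembly_scaffold) are hypothetical placeholders, and the helper functions run and rascaf_workflow exist only for this illustration. The -b, -f, -o and -r options and the .out suffix follow the usage described in the Rascaf README as best recalled; check them against the usage message of the installed version before relying on them. The sketch also assumes that loading the modules puts rascaf and rascaf-join on the PATH. The optional validation stage is omitted. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: first argument is the draft assembly (FASTA),
# remaining arguments are RNA-seq alignments in BAM format, one per data set.
rascaf_workflow() {
    assembly="$1"; shift
    join_args=""

    run module load gcc/5.2.0 rascaf

    # Compute candidate connections for each data set independently.
    for bam in "$@"; do
        prefix="${bam%.bam}"
        run rascaf -b "$bam" -f "$assembly" -o "$prefix"
        # Assumption: rascaf writes its connections to <prefix>.out
        join_args="$join_args -r ${prefix}.out"
    done

    # Reconcile all connections and build the scaffolds.
    # $join_args is left unquoted on purpose so it splits into separate options.
    run rascaf-join $join_args -o assembly_scaffold
}

rascaf_workflow assembly.fa reads_1.bam reads_2.bam
```

In dry-run mode this prints one module load line, one rascaf line per BAM file, and a single rascaf-join line that lists every .out file, which makes it easy to review the commands before submitting them as a job.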


If you publish research that uses Rascaf, please cite it as follows:

Song, L., Shankar, D. and Florea, L. "Rascaf: Improving Genome Assembly with RNA-seq Data", The Plant Genome, 2016. doi: 10.3835/plantgenome2016.03.0027.