Difference between revisions of "SAMStat"

From UFRC

Revision as of 21:24, 6 December 2019

Description

samstat website  

Next-generation sequencing is being applied to understand individual variation, the RNA output of a cell, and epigenetic regulation. The millions of sequenced reads are commonly stored in FASTA or FASTQ format and, after mapping to a reference genome, in the Sequence Alignment/Map format (SAM/BAM). To monitor sequence quality over time and to identify problems, it is necessary to report various statistics of the reads at different stages of processing.

SAMStat is an efficient C program that quickly displays statistics of large sequence files from next-generation sequencing projects. When applied to SAM/BAM files, all statistics are reported separately for unmapped, poorly mapped, and accurately mapped reads. This makes it possible to identify a variety of problems that cause poor mapping, such as residual linker and adaptor sequences. SAMStat can also be used to verify individual processing steps in large analysis pipelines.

SAMStat reports nucleotide composition, length distribution, base quality distribution, mapping statistics, mismatch, insertion and deletion error profiles, and di-nucleotide and 10-mer over-representation. The output is a single HTML5 page that can be interpreted by a non-specialist.

Environment Modules

Run module spider samstat to find out what environment modules are available for this application.

System Variables

  • HPC_SAMSTAT_DIR - installation directory
  • HPC_SAMSTAT_BIN - executable directory
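
As a rough illustration of how the module and the two variables above fit together, the following shell sketch loads the module when a `module` command is available and then reports what each variable points to. The bare module name `samstat` in `module load` is an assumption (the output of `module spider samstat` shows the exact names and versions on offer), and `samstat_env_report` is a hypothetical helper written for this example, not part of SAMStat or of the module system.

```shell
# Load the module only if an environment-modules command exists in this shell.
# The bare name "samstat" is an assumption; check `module spider samstat` first.
if command -v module >/dev/null 2>&1; then
    module load samstat
fi

# Hypothetical helper: print each documented variable, or a hint if it is unset.
samstat_env_report() {
    for name in HPC_SAMSTAT_DIR HPC_SAMSTAT_BIN; do
        # printenv prints the value of an exported variable, or nothing if unset.
        value="$(printenv "$name")"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is not set (was the module loaded?)"
        fi
    done
}

samstat_env_report
```

If both variables are reported, listing the contents of `$HPC_SAMSTAT_BIN` is a quick way to see which executables the module provides.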