Themisto

From UFRC
Revision as of 20:14, 7 July 2021 by Johnbullard (talk | contribs)

Description

themisto website  

A metagenomic sample is a set of sequencing reads from the microbial life in a particular environment. Standard analysis estimates the species composition of the environment by aligning the reads against a reference database. In the age of pangenomics, alignment is preferentially done against a variation graph encompassing all variation within a species.

Themisto is a space-efficient tool for indexing such variation graphs. The Themisto index is a compressed colored de Bruijn graph of order k, in which each node carries a set of colors representing the reference sequences that contain the k-mer corresponding to that node. Reads are pseudoaligned to the index using a method similar to the one used by the tool kallisto: all k-mers of the read are located in the de Bruijn graph, and the intersection of the color sets of the matching nodes is returned.
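The pseudoalignment idea described above can be illustrated with a small Python sketch. This is not Themisto's implementation: it uses a plain dictionary instead of a compressed de Bruijn graph, ignores reverse complements, and skips k-mers that are absent from the index (an assumption of this sketch). The function names and the toy reference sequences are hypothetical.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> List[str]:
    """Return every length-k substring of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_colored_index(references: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of reference names (its colors) containing it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def pseudoalign(read: str, index: Dict[str, Set[str]], k: int) -> Set[str]:
    """Intersect the color sets of the read's k-mers that occur in the index."""
    result = None
    for kmer in kmers(read, k):
        colors = index.get(kmer)
        if colors is None:
            continue  # assumption of this sketch: skip k-mers not in the index
        result = set(colors) if result is None else result & colors
    return result if result is not None else set()


# Toy data (hypothetical): two references that share some 4-mers.
references = {"ref_A": "ACGTACGGA", "ref_B": "TTACGTACC"}
index = build_colored_index(references, k=4)

hits_shared = pseudoalign("ACGTAC", index, k=4)  # k-mers present in both references
hits_unique = pseudoalign("GTACGG", index, k=4)  # contains a k-mer found only in ref_A
```

In this toy example the first read is compatible with both references, while the second is compatible only with ref_A, because one of its k-mers (ACGG) occurs in ref_A alone.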

Environment Modules

Run module spider themisto to find out what environment modules are available for this application.

System Variables

  • HPC_THEMISTO_DIR - installation directory
  • HPC_THEMISTO_BIN - executable directory
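
As a minimal sketch of how these variables might be used from a script, the following Python snippet lists the executables found in the directory named by HPC_THEMISTO_BIN. It assumes the variable is set in the current environment (for example, after the module has been loaded); the helper name is hypothetical, and no particular executable names are assumed.

```python
import os
from pathlib import Path
from typing import List


def list_themisto_executables(env_var: str = "HPC_THEMISTO_BIN") -> List[str]:
    """Return sorted names of executable files in the directory named by env_var."""
    bin_dir = os.environ.get(env_var)
    if not bin_dir or not Path(bin_dir).is_dir():
        return []  # variable unset (module not loaded?) or directory missing
    return sorted(
        p.name
        for p in Path(bin_dir).iterdir()
        if p.is_file() and os.access(p, os.X_OK)
    )


if __name__ == "__main__":
    print(list_themisto_executables())
```

An empty list indicates that the variable is not set or does not point to a directory.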




Citation

If you publish research that uses Themisto, cite it as follows:

Tommi Mäklin, Teemu Kallonen, Jarno Alanko, Veli Mäkinen, Jukka Corander, Antti Honkela. Genomic Epidemiology with Mixed Samples. Supplement: Pseudoalignment in the mGEMS pipeline.