Difference between revisions of "Khmer"

Revision as of 19:28, 12 August 2022

Description

khmer - Python scripts for k-mer counting, filtering, and graph traversal.

Available scripts: abundance-dist.py, count-median.py, do-partition.sh, filter-abund.py, find-knots.py, load-into-counting.py, merge-partitions.py, normalize-by-median.py, partition-graph.py, annotate-partitions.py, count-overlap.py, extract-partitions.py, filter-stoptags.py, load-graph.py, make-initial-stoptags.py, normalize-by-kadian.py, normalize-by-min.py

Use "import khmer" in your script or in an interactive Python session.
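As a minimal sketch, the check below confirms that the package is importable before a script relies on it. The helper name khmer_available is hypothetical and is not part of khmer; only the standard library is used, so nothing about khmer's own API is assumed.

```python
import importlib.util


def khmer_available() -> bool:
    """Return True if a package named khmer can be found on the import path."""
    return importlib.util.find_spec("khmer") is not None


if __name__ == "__main__":
    if khmer_available():
        import khmer  # the import described above
        print("khmer found at", khmer.__file__)
    else:
        print("khmer not found; load the khmer environment module first")
```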

Environment Modules

Run "module spider khmer" to find out which environment modules are available for this application.

System Variables

HPC_KHMER_DIR - installation directory
HPC_KHMER_BIN - executable directory
HPC_KHMER_LIB - library directory
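The variables above can be read from Python once the khmer module is loaded. The following is a sketch only: the helper names khmer_paths and script_path are hypothetical, and it assumes that the khmer scripts are located in the directory named by HPC_KHMER_BIN.

```python
import os

# Variable names taken from the list above; they are expected to be set
# only after the khmer environment module has been loaded.
KHMER_VARS = ("HPC_KHMER_DIR", "HPC_KHMER_BIN", "HPC_KHMER_LIB")


def khmer_paths(environ=None):
    """Map each HPC_KHMER_* variable to its value, or None if it is unset."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in KHMER_VARS}


def script_path(script, environ=None):
    """Build the path to a script, assuming scripts live in HPC_KHMER_BIN."""
    bin_dir = khmer_paths(environ)["HPC_KHMER_BIN"]
    if bin_dir is None:
        raise RuntimeError("HPC_KHMER_BIN is not set; load the khmer module first")
    return os.path.join(bin_dir, script)


if __name__ == "__main__":
    for name, value in khmer_paths().items():
        print(name, "=", value)
```

Passing a dictionary as environ makes the helpers easy to try without loading the module.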

Citation

If you use the khmer software, you must cite:

Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190

If you use any of khmer's published scientific methods, you should *also* cite the relevant paper(s), as directed below.

Graph partitioning and/or compressible graph representation

The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in:

Scaling metagenome sequence assembly with probabilistic de Bruijn graphs

Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT

Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7

doi: 10.1073/pnas.1121464109

PMID: 22847406

Digital normalization

The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm, described in:

A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data

Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH

arXiv:1203.4802 [q-bio.GN]

http://arxiv.org/abs/1203.4802
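The idea behind digital normalization can be illustrated with a short pure-Python sketch. This is not the code in normalize-by-median.py: it uses exact counts in a dictionary rather than khmer's probabilistic table, ignores reverse complements, and the function names and the default values of k and cutoff are chosen only for illustration. A read is kept only while the median count of its k-mers, among the reads kept so far, is below the cutoff.

```python
from collections import defaultdict
from statistics import median


def kmers(seq, k):
    """Yield every substring of length k in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def normalize_by_median(reads, k=4, cutoff=2):
    """Return the reads whose median k-mer count was below cutoff when seen."""
    counts = defaultdict(int)  # exact counts (illustrative simplification)
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read is shorter than k, so it has no k-mers
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            # Only kept reads contribute to the running counts.
            for km in read_kmers:
                counts[km] += 1
    return kept


if __name__ == "__main__":
    reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
    print(normalize_by_median(reads, k=4, cutoff=2))
```

In this toy input, redundant copies of the first read are discarded once its k-mers are well covered, while the novel read is retained.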

K-mer counting

The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in:

These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure

Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT

arXiv:1309.2975 [q-bio.GN]

http://arxiv.org/abs/1309.2975
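Probabilistic k-mer counting of this kind can be sketched with a small Count-Min Sketch: several fixed-size tables are updated for every k-mer, and the reported count is the minimum across tables, so collisions can inflate a count but never lower it. The class below is an illustrative assumption, not khmer's implementation; the table sizes, the choice of hash function, and all names are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: counts may be overestimated, never underestimated."""

    def __init__(self, table_sizes=(1009, 1013, 1019)):
        # One table per size; distinct prime sizes spread collisions differently.
        self.tables = [[0] * size for size in table_sizes]

    def _slots(self, kmer):
        """Hash the k-mer once and reduce it modulo each table size."""
        digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        return [h % len(table) for table in self.tables]

    def add(self, kmer):
        for table, slot in zip(self.tables, self._slots(kmer)):
            table[slot] += 1

    def count(self, kmer):
        return min(table[slot] for table, slot in zip(self.tables, self._slots(kmer)))


def count_kmers(reads, k, sketch=None):
    """Add every k-mer of every read to the sketch and return the sketch."""
    if sketch is None:
        sketch = CountMinSketch()
    for read in reads:
        for i in range(len(read) - k + 1):
            sketch.add(read[i:i + k])
    return sketch


if __name__ == "__main__":
    sketch = count_kmers(["ACGTACGT"], k=4)
    print("ACGT seen at least", sketch.count("ACGT"), "times")
```

Memory use is fixed by the table sizes rather than by the number of distinct k-mers, at the cost of occasional overcounts.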

@@ Line 1: / Line 1: @@
 __NOTOC__
 __NOEDITSECTION__
-[[Category:Software]][[Category:Bioinformatics]][[Category:NGS]]
+[[Category:Software]][[Category:Biology]][[Category:NGS]]
 {|<!--Main settings - REQUIRED-->
 |{{#vardefine:app|khmer}}