Difference between revisions of "Khmer"

From UFRC
m (Text replace - "{{#if: {{#var: mod}}|==Execution Environment and Modules== {{App_Module|app={{#var:app}}|intel={{#var:intel}}|mpi={{#var:mpi}}}}|}}" to "==Required Modules== modules documentation ===Serial=== *{{#var:app}}")
m (Text replacement - "#uppercase" to "uc")
(12 intermediate revisions by 3 users not shown)
Line 2: Line 2:
 
__NOEDITSECTION__
 
__NOEDITSECTION__
 
[[Category:Software]][[Category:Bioinformatics]][[Category:NGS]]
 
[[Category:Software]][[Category:Bioinformatics]][[Category:NGS]]
<!-- ########  Template Configuration ######## -->
+
{|<!--Main settings - REQUIRED-->
<!--Edit definitions of the variables used in template calls
 
Required variables:
 
app - lowercase name of the application e.g. "amber"
 
url - url of the software page (project, company product, etc) - e.g. "http://ambermd.org/"
 
Optional variables:
 
INTEL - Version of the Intel Compiler e.g. "11.1"
 
MPI - MPI Implementation and version e.g. "openmpi/1.3.4"
 
-->
 
{|
 
<!--Main settings - REQUIRED-->
 
 
|{{#vardefine:app|khmer}}
 
|{{#vardefine:app|khmer}}
|{{#vardefine:url|https://github.com/ctb/khmer}}
+
|{{#vardefine:url|https://github.com/ged-lab/khmer}}
<!--Compiler and MPI settings - OPTIONAL -->
 
|{{#vardefine:intel|}} <!-- E.g. "11.1" -->
 
|{{#vardefine:mpi|}} <!-- E.g. "openmpi/1.3.4" -->
 
<!--Choose sections to enable - OPTIONAL-->
 
|{{#vardefine:mod|1}} <!--Present instructions for running the software with modules -->
 
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
Line 26: Line 11:
 
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
 
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
|{{#vardefine:citation|}} <!--Enable Reference/Citation section -->
+
|{{#vardefine:citation|1}} <!--Enable Reference/Citation section -->
 
|}
 
|}
 
<!-- ########  Template Body ######## -->
 
<!-- ########  Template Body ######## -->
Line 32: Line 17:
 
{{#if: {{#var: url}}|
 
{{#if: {{#var: url}}|
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
 
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
Khmer - python scripts for k-mer counting, filtering and graph traversal.
 
  
There's a khmer mailing list at librelist.com that you can use to get help with khmer. To sign up, email 'khmer@librelist.com'  to subscribe; then send your question/comment there.
+
Khmer - python scripts for k-mer counting, filtering and graph traversal.
 
 
'''IMPORTANT NOTE:'''
 
 
 
khmer is *pre-publication* and *research* software, so please keep in mind that (a) the code may have undiscovered bugs in it, (b) you should cite us, and (c) you should get in touch if you need to cite us, as we are writing up the project.
 
  
 
Available scripts: abundance-dist.py, count-median.py, do-partition.sh, filter-abund.py, find-knots.py, load-into-counting.py, merge-partitions.py, normalize-by-median.py, partition-graph.py, annotate-partitions.py, count-overlap.py, extract-partitions.py, filter-stoptags.py, load-graph.py, make-initial-stoptags.py, normalize-by-kadian.py, normalize-by-min.py
 
Available scripts: abundance-dist.py, count-median.py, do-partition.sh, filter-abund.py, find-knots.py, load-into-counting.py, merge-partitions.py, normalize-by-median.py, partition-graph.py, annotate-partitions.py, count-overlap.py, extract-partitions.py, filter-stoptags.py, load-graph.py, make-initial-stoptags.py, normalize-by-kadian.py, normalize-by-min.py
  
To use the khmer module make sure python/2.7.2 is loaded and use "import khmer" in your script or in an interactive python session.
+
Use "import khmer" in your script or in an interactive python session.
 
 
 
<!--Modules-->
 
<!--Modules-->
 
==Required Modules==
 
==Required Modules==
Line 49: Line 28:
 
===Serial===
 
===Serial===
 
*{{#var:app}}
 
*{{#var:app}}
* HPC_KHMER_BIN - directory where the scripts are located
+
==System Variables==
* HPC_KHMER_DOC - khmer documents are in this directory
+
* HPC_{{uc:{{#var:app}}}}_DIR
* HPC_KHMER_DATA - sample datasets are in this directory
+
* HPC_KHMER_BIN
 +
* HPC_KHMER_LIB
 
{{#if: {{#var: exe}}|==How To Run==
 
{{#if: {{#var: exe}}|==How To Run==
 
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
 
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
Line 65: Line 45:
 
*'''Q:''' **'''A:'''|}}
 
*'''Q:''' **'''A:'''|}}
 
{{#if: {{#var: citation}}|==Citation==
 
{{#if: {{#var: citation}}|==Citation==
If you publish research that uses {{{app}}} you have to cite it as follows:
+
If you use the khmer software, you must cite:
WRITE CITATION HERE
+
 
 +
: Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190
 +
 
 +
If you use any of khmer's published scientific methods, you should *also* cite the relevant paper(s), as directed below.
 +
 
 +
* Graph partitioning and/or compressible graph representation
 +
: The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in:
 +
:: Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT
 +
:: Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7
 +
:: doi: 10.1073/pnas.1121464109
 +
:: PMID: 22847406
 +
* Digital normalization
 +
: The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm, described in:
 +
:: A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data
 +
:: Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH
 +
:: arXiv:1203.4802 [q-bio.GN]
 +
:: http://arxiv.org/abs/1203.4802
 +
* K-mer counting
 +
: The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in:
 +
:: These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure
 +
:: Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT.
 +
:: arXiv:1309.2975 [q-bio.GN]
 +
:: http://arxiv.org/abs/1309.2975
 +
 
 
|}}
 
|}}
 +
=Validation=
 +
* Validated 4/5/2018

Revision as of 21:21, 6 December 2019

Description

khmer website: https://github.com/ged-lab/khmer

khmer: Python scripts for k-mer counting, filtering, and graph traversal.

Available scripts: abundance-dist.py, count-median.py, do-partition.sh, filter-abund.py, find-knots.py, load-into-counting.py, merge-partitions.py, normalize-by-median.py, partition-graph.py, annotate-partitions.py, count-overlap.py, extract-partitions.py, filter-stoptags.py, load-graph.py, make-initial-stoptags.py, normalize-by-kadian.py, normalize-by-min.py

To use khmer as a Python library, add "import khmer" to your script or run it in an interactive Python session.
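
A quick way to confirm that the import works after loading the module is sketched below. The helper name load_optional is hypothetical and not part of khmer, and the version attribute is looked up defensively because this page does not say which khmer release is installed.

```python
import importlib
import importlib.util


def load_optional(module_name):
    """Return the imported module, or None if it cannot be found."""
    # find_spec returns None when no top-level module of that name exists.
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)


khmer = load_optional("khmer")
if khmer is None:
    print("khmer is not importable; load the khmer module first")
else:
    # __version__ is an assumption about this build, so fall back to "unknown".
    print("khmer imported, version:", getattr(khmer, "__version__", "unknown"))
```

If the first message is printed, the khmer environment module is most likely not loaded in the current shell.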

Required Modules

modules documentation

Serial

  • khmer

System Variables

  • HPC_KHMER_DIR
  • HPC_KHMER_BIN
  • HPC_KHMER_LIB
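
Once the khmer module is loaded, these variables can be read from a Python script. The following is a minimal sketch, assuming HPC_KHMER_BIN points at the directory holding the scripts listed above; the function names khmer_paths and list_scripts are made up for illustration.

```python
import os


def khmer_paths(env=None):
    """Map each khmer system variable to its value, or None if unset."""
    env = os.environ if env is None else env
    names = ("HPC_KHMER_DIR", "HPC_KHMER_BIN", "HPC_KHMER_LIB")
    return {name: env.get(name) for name in names}


def list_scripts(bin_dir):
    """List the .py and .sh files in bin_dir, or [] if it is not a directory."""
    if not bin_dir or not os.path.isdir(bin_dir):
        return []
    return sorted(f for f in os.listdir(bin_dir) if f.endswith((".py", ".sh")))


paths = khmer_paths()
for name, value in paths.items():
    print(name, "=", value if value else "(not set)")
print(list_scripts(paths["HPC_KHMER_BIN"]))
```

All three variables print as "(not set)" when the module has not been loaded.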




Citation

If you use the khmer software, you must cite:

Crusoe et al., The khmer software package: enabling efficient sequence analysis. 2014. doi: 10.6084/m9.figshare.979190

If you use any of khmer's published scientific methods, you should *also* cite the relevant paper(s), as directed below.

  • Graph partitioning and/or compressible graph representation
The load-graph.py, partition-graph.py, and find-knots.py scripts are part of the compressible graph representation and partitioning algorithms described in:
Scaling metagenome sequence assembly with probabilistic de Bruijn graphs
Pell J, Hintze A, Canino-Koning R, Howe A, Tiedje JM, Brown CT
Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13272-7
doi: 10.1073/pnas.1121464109
PMID: 22847406
  • Digital normalization
The normalize-by-median.py and count-median.py scripts are part of the digital normalization algorithm, described in:
A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data
Brown CT, Howe AC, Zhang Q, Pyrkosz AB, Brom TH
arXiv:1203.4802 [q-bio.GN]
http://arxiv.org/abs/1203.4802
  • K-mer counting
The abundance-dist.py, filter-abund.py, and load-into-counting.py scripts implement the probabilistic k-mer counting described in:
These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure
Zhang Q, Pell J, Canino-Koning R, Howe AC, Brown CT.
arXiv:1309.2975 [q-bio.GN]
http://arxiv.org/abs/1309.2975
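
To give a feel for the ideas behind the k-mer counting and digital normalization entries above, here is a small pure-Python illustration. It is not khmer's implementation and does not use the khmer API: the CountMinSketch class, the table sizes, the choice of hash, and the k and cutoff defaults are all assumptions chosen to keep the example short. It counts k-mers approximately in a fixed amount of memory, then keeps a read only while the median count of its k-mers is still below a cutoff.

```python
import hashlib
from statistics import median


class CountMinSketch:
    """Approximate counter: may overestimate a count, never underestimates."""

    def __init__(self, width=10007, depth=4):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, kmer):
        # One salted hash per table gives independent slot choices.
        for i in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([i])
            ).digest()
            yield i, int.from_bytes(digest, "big") % self.width

    def add(self, kmer):
        for i, j in self._slots(kmer):
            self.tables[i][j] += 1

    def count(self, kmer):
        # The smallest cell is the one least inflated by collisions.
        return min(self.tables[i][j] for i, j in self._slots(kmer))


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize(reads, k=4, cutoff=2):
    """Keep a read only if its median k-mer count so far is below cutoff."""
    sketch = CountMinSketch()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read shorter than k
        if median(sketch.count(w) for w in words) < cutoff:
            kept.append(read)
            for w in words:
                sketch.add(w)
    return kept


reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
print(normalize(reads))
```

For real data, use the khmer scripts themselves (for example load-into-counting.py and normalize-by-median.py), which are the implementations the papers above describe.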

Validation

  • Validated 4/5/2018