Difference between revisions of "Metabin"

From UFRC
 
(8 intermediate revisions by 4 users not shown)
Line 1: Line 1:
[[Category:Software]]
+
[[Category:Software]][[Category:Genomics]][[Category:Biology]]
 
{|<!--CONFIGURATION: REQUIRED-->
 
{|<!--CONFIGURATION: REQUIRED-->
 
|{{#vardefine:app|metabin}}
 
|{{#vardefine:app|metabin}}
Line 6: Line 6:
 
|{{#vardefine:conf|}}          <!--CONFIGURATION-->
 
|{{#vardefine:conf|}}          <!--CONFIGURATION-->
 
|{{#vardefine:exe|1}}            <!--ADDITIONAL INFO-->
 
|{{#vardefine:exe|1}}            <!--ADDITIONAL INFO-->
|{{#vardefine:pbs|1}}            <!--PBS SCRIPTS-->
+
|{{#vardefine:pbs|}}            <!--PBS SCRIPTS-->
 
|{{#vardefine:policy|}}        <!--POLICY-->
 
|{{#vardefine:policy|}}        <!--POLICY-->
 
|{{#vardefine:testing|}}      <!--PROFILING-->
 
|{{#vardefine:testing|}}      <!--PROFILING-->
Line 26: Line 26:
  
 
<!--Modules-->
 
<!--Modules-->
==Required Modules==
+
==Environment Modules==
 
+
Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
===Serial===
 
* {{#var:app}}
 
<!--
 
===Parallel (OpenMP)===
 
* intel
 
* {{#var:app}}
 
===Parallel (MPI)===
 
* intel
 
* openmpi
 
* {{#var:app}}
 
-->
 
 
==System Variables==
 
==System Variables==
* HPC_{{#uppercase:{{#var:app}}}}_DIR
+
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 
<!--Configuration-->
 
<!--Configuration-->
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
Line 47: Line 36:
 
<!--Run-->
 
<!--Run-->
 
{{#if: {{#var: exe}}|==Additional Information==
 
{{#if: {{#var: exe}}|==Additional Information==
Metabin uses Jim Kent's [http://genome-test.cse.ucsc.edu/~kent/exe/ Blat] application as an alignment method that is much faster than Blastx. Unfortunately, Blat has a bug which causes it to crash with large reference databases, like nr (see the [[Blat|Blat wiki page]]. As a workaround, we suggest dividing large reference databases into multiple files. To implement this, it is necessary to run the prepareinput step of Metabin with the "-b n" option, and run the Blat analyses separately. An example PBS script is provided to demonstrate how to do this.
+
Metabin uses Jim Kent's [http://genome.ucsc.edu/FAQ/FAQblat.html Blat] application as an alignment method that is much faster than Blastx. Unfortunately, Blat has a bug which causes it to crash with large reference databases, like nr (see the [[BLAT|Blat wiki page]]). As a workaround, we suggest dividing large reference databases into multiple files. To implement this, it is necessary to run the prepareinput step of Metabin with the "-b n" option, and run the Blat analyses separately. An example PBS script is provided to demonstrate how to do this.
 
|}}
 
|}}
 
<!--PBS scripts-->
 
<!--PBS scripts-->

Latest revision as of 19:25, 18 August 2022

Description

metabin website  

MetaBin: a program for accurate, fast and highly sensitive taxonomic assignments of metagenomic sequences.

For comprehensive taxonomic binning, we developed the ‘MetaBin’ web server and standalone program for faster and more accurate taxonomic assignment of single and paired-end sequence reads of varying lengths (≥45 bp) obtained from both Sanger and next-generation sequencing platforms. We benchmarked it using both simulated reads (> 1 million) and real metagenomic datasets. MetaBin correctly assigns more reads to their expected taxonomic lineages, with a lower error frequency, than other methods. It displays high accuracy (positive predictive value (PPV) ≥99%) along with high sensitivity (≥94%) for various read lengths. In particular, for short Illumina reads (~45-75 bp) it makes about 4% more assignments than its closest competitors, with near 100% accuracy when reference genomes are available.

Because MetaBin uses Blat, a faster alignment method than Blastx (though both options are available), analysis time is reduced 50- to 1000-fold, which makes it comparable to or faster than composition-based methods, which are usually the faster approach. This makes it practical to use a more accurate and sensitive homology-based approach for high-throughput analysis of large datasets, because it removes the bottleneck of generating alignments with Blastx. The MetaBin web server allows users to upload their own data, as sequence reads or Blastx output, to carry out taxonomic analysis. It provides several visualization options for constructing a taxonomic tree of the results, and for performing comparative analysis of the taxonomic profiles of multiple metagenomic datasets.

The standalone command-line version is installed.

Environment Modules

Run module spider metabin to find out what environment modules are available for this application.

System Variables

  • HPC_METABIN_DIR - installation directory

Additional Information

Metabin uses Jim Kent's Blat application as an alignment method that is much faster than Blastx. Unfortunately, Blat has a bug that causes it to crash with large reference databases such as nr (see the Blat wiki page). As a workaround, we suggest dividing large reference databases into multiple files. To do this, run the prepareinput step of Metabin with the "-b n" option and run the Blat analyses separately. An example PBS script is provided to demonstrate how to do this.
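
The general idea of dividing a large reference database into multiple files can be sketched in Python as follows. This is a minimal illustration, not Metabin's own mechanism: it assumes the database is a plain FASTA file, the function name split_fasta and the ".partN.fa" output naming are hypothetical, and it does not cover how Metabin's prepareinput step or its "-b n" option expect the pieces to be named or arranged.

```python
from pathlib import Path


def split_fasta(src, out_dir, n_parts):
    """Distribute the records of a FASTA file round-robin over n_parts files.

    Records are never cut in half: a new output file is selected only
    when a header line (starting with '>') is read.
    Returns the list of output paths (hypothetical naming scheme).
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = [out_dir / f"{src.stem}.part{i + 1}.fa" for i in range(n_parts)]
    handles = [p.open("w") for p in paths]
    try:
        current = -1  # no record seen yet
        with src.open() as fh:
            for line in fh:
                if line.startswith(">"):
                    # Each new record goes to the next file in turn.
                    current = (current + 1) % n_parts
                if current >= 0:
                    # Lines before the first header are skipped.
                    handles[current].write(line)
    finally:
        for handle in handles:
            handle.close()
    return paths
```

For example, split_fasta("nr.fa", "nr_parts", 8) would write nr_parts/nr.part1.fa through nr_parts/nr.part8.fa; the input name and the number of parts here are placeholders. Round-robin assignment keeps the record counts of the pieces within one of each other, although the pieces can still differ in total sequence length.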



Citation

If you publish research that uses metabin, cite it as follows: Sharma, V.K., Kumar, N., Prakash, T., Taylor, T.D., 2012. Fast and Accurate Taxonomic Assignments of Metagenomic Sequences Using MetaBin. PLoS One 7.