Difference between revisions of "ExaBayes"

From UFRC
 
(5 intermediate revisions by 3 users not shown)
Line 1: Line 1:
[[Category:Software]][[Category:Bioinformatics]][[Category:Phylogenetics]]
+
[[Category:Software]][[Category:Phylogenetics]]
 
{|<!--CONFIGURATION: REQUIRED-->
 
{|<!--CONFIGURATION: REQUIRED-->
 
|{{#vardefine:app|exabayes}}
 
|{{#vardefine:app|exabayes}}
Line 10: Line 10:
 
|{{#vardefine:testing|}}      <!--PROFILING-->
 
|{{#vardefine:testing|}}      <!--PROFILING-->
 
|{{#vardefine:faq|}}            <!--FAQ-->
 
|{{#vardefine:faq|}}            <!--FAQ-->
|{{#vardefine:citation|}}      <!--CITATION-->
+
|{{#vardefine:citation|1}}      <!--CITATION-->
 
|{{#vardefine:installation|}} <!--INSTALLATION-->
 
|{{#vardefine:installation|}} <!--INSTALLATION-->
 
|}
 
|}
Line 26: Line 26:
 
The distinguishing feature of ExaBayes is its capability to handle enormous datasets efficiently. ExaBayes provides an implementation of data parallelism using the Message Passing Interface (MPI). This means, that if you conduct your analysis on a computing cluster composed of several machines (a.k.a. nodes), the memory needed to evaluate the likelihood of trees and parameters given a large alignment can be spread out across multiple computing nodes. In conclusion, the size of the concatenated alignment ExaBayes can handle is only limited by the combined main memory of your entire computing cluster.
 
The distinguishing feature of ExaBayes is its capability to handle enormous datasets efficiently. ExaBayes provides an implementation of data parallelism using the Message Passing Interface (MPI). This means, that if you conduct your analysis on a computing cluster composed of several machines (a.k.a. nodes), the memory needed to evaluate the likelihood of trees and parameters given a large alignment can be spread out across multiple computing nodes. In conclusion, the size of the concatenated alignment ExaBayes can handle is only limited by the combined main memory of your entire computing cluster.
  
Aside from that ExaBayes also implements
+
Aside from that ExaBayes also implements chain-level and run-level parallelism, techniques to trade runtime for reduced memory footprint, a subtree equality vector approach that reduces memory without loss of runtime, a native AVX implementation for evaluating likelihood and parsimony scores (i.e., ExaBayes makes full use of your cutting-edge CPU), techniques to efficiently handle an arbitrary number of partitions.
 
 
chain-level and run-level parallelism,
 
techniques to trade runtime for reduced memory footprint,
 
a subtree equality vector approach that reduces memory without loss of runtime,
 
a native AVX implementation for evaluating likelihood and parsimony scores (i.e., ExaBayes makes full use of your cutting-edge CPU),
 
techniques to efficiently handle an arbitrary number of partitions.
 
 
We use the highly efficient parsimony and likelihood implementation of RAxML [4]. Many of the techniques described above are adapted from or inspired by our experiences with large-scale maximum likelihood inferences using RAxML-Light/ExaML [5,6].
 
We use the highly efficient parsimony and likelihood implementation of RAxML [4]. Many of the techniques described above are adapted from or inspired by our experiences with large-scale maximum likelihood inferences using RAxML-Light/ExaML [5,6].
  
 
The ExaBayes package contains all tools necessary for post-processing your sampled chains. For visualization of parameter distributions, we recommend Tracer and FigTree (for which ExaBayes parameter files are compatible).
 
The ExaBayes package contains all tools necessary for post-processing your sampled chains. For visualization of parameter distributions, we recommend Tracer and FigTree (for which ExaBayes parameter files are compatible).
 
;References
 
 
[1] Fredrik Ronquist, Maxim Teslenko, Paul van der Mark, Daniel L Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc a Suchard, and John P Huelsenbeck. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic biology, 61(3):539-42, May 2012. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Ronquist2012 bib] | [http://dx.doi.org/10.1093/sysbio/sys029 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3329765\&tool=pmcentrez\&rendertype=abstract http] ]
 
 
[2] Alexei J Drummond, Marc a Suchard, Dong Xie, and Andrew Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular biology and evolution, 29(8):1969-73, August 2012. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Drummond2012 bib] | [http://dx.doi.org/10.1093/molbev/mss075 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3408070\&tool=pmcentrez\&rendertype=abstract http] ]
 
 
[3] Clemens Lakner, Paul van der Mark, John P Huelsenbeck, Bret Larget, and Fredrik Ronquist. Efficiency of Markov chain Monte Carlo tree proposals in Bayesian phylogenetics. Systematic biology, 57(1):86-103, February 2008. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Lakner2008a bib] | [http://dx.doi.org/10.1080/10635150801886156 DOI] | [http://www.ncbi.nlm.nih.gov/pubmed/18278678 http] ]
 
 
[4] Alexandros Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22(21):2688-2690, November 2006. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Stamatakis2006 bib] | [http://dx.doi.org/10.1093/bioinformatics/btl446 DOI] | [http://dx.doi.org/10.1093/bioinformatics/btl446 http] ]
 
 
[5] Alexandros Stamatakis, Andre J Aberer, Christian Goll, Stephen A Smith, Simon A Berger, and Fernando Izquierdo-Carrasco. RAxML-Light: a tool for computing terabyte phylogenies. Bioinformatics (Oxford, England), 28(15):2064-6, August 2012. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Stamatakis2012 bib] | [http://dx.doi.org/10.1093/bioinformatics/bts309 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3400957\&tool=pmcentrez\&rendertype=abstract http] ]
 
 
<!--Modules-->
 
<!--Modules-->
 
==Required Modules==
 
==Required Modules==
Line 63: Line 45:
 
* {{#var:app}}
 
* {{#var:app}}
 
==System Variables==
 
==System Variables==
* HPC_{{#uppercase:{{#var:app}}}}_DIR - installation directory
+
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
* HPC_{{#uppercase:{{#var:app}}}}_BIN - executable directory
+
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory
* HPC_{{#uppercase:{{#var:app}}}}_DOC - documentation directory
+
* HPC_{{uc:{{#var:app}}}}_DOC - documentation directory
 
<!--Configuration-->
 
<!--Configuration-->
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
Line 90: Line 72:
 
*'''Q:''' **'''A:'''|}}
 
*'''Q:''' **'''A:'''|}}
 
<!--Citation-->
 
<!--Citation-->
{{#if: {{#var: citation}}|==Citation==
+
==Citation==
 
If you publish research that uses {{#var:app}} you have to cite it as follows:
 
If you publish research that uses {{#var:app}} you have to cite it as follows:
WRITE_CITATION_HERE
+
# Fredrik Ronquist, Maxim Teslenko, Paul van der Mark, Daniel L Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc a Suchard, and John P Huelsenbeck. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic biology, 61(3):539-42, May 2012. [[http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Ronquist2012 bib] | [http://dx.doi.org/10.1093/sysbio/sys029 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3329765\&tool=pmcentrez\&rendertype=abstract http] ]
|}}
+
#Alexei J Drummond, Marc a Suchard, Dong Xie, and Andrew Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular biology and evolution, 29(8):1969-73, August 2012. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Drummond2012 bib] | [http://dx.doi.org/10.1093/molbev/mss075 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3408070\&tool=pmcentrez\&rendertype=abstract http] ]
 +
#Clemens Lakner, Paul van der Mark, John P Huelsenbeck, Bret Larget, and Fredrik Ronquist. Efficiency of Markov chain Monte Carlo tree proposals in Bayesian phylogenetics. Systematic biology, 57(1):86-103, February 2008. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Lakner2008a bib] | [http://dx.doi.org/10.1080/10635150801886156 DOI] | [http://www.ncbi.nlm.nih.gov/pubmed/18278678 http] ]
 +
#Alexandros Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22(21):2688-2690, November 2006. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Stamatakis2006 bib] | [http://dx.doi.org/10.1093/bioinformatics/btl446 DOI] | [http://dx.doi.org/10.1093/bioinformatics/btl446 http] ]
 +
#Alexandros Stamatakis, Andre J Aberer, Christian Goll, Stephen A Smith, Simon A Berger, and Fernando Izquierdo-Carrasco. RAxML-Light: a tool for computing terabyte phylogenies. Bioinformatics (Oxford, England), 28(15):2064-6, August 2012. [ [http://sco.h-its.org/exelixis/web/software/exabayes/manual/library2_bib.html#Stamatakis2012 bib] | [http://dx.doi.org/10.1093/bioinformatics/bts309 DOI] | [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3400957\&tool=pmcentrez\&rendertype=abstract http] ]
 
<!--Installation-->
 
<!--Installation-->
 
{{#if: {{#var: installation}}|==Installation==
 
{{#if: {{#var: installation}}|==Installation==

Latest revision as of 21:12, 21 December 2022

Description

exabayes website  

The user manual is available online or in the $HPC_EXABAYES_DOC directory.

ExaBayes is a tool for Bayesian phylogenetic analyses. It implements a Markov chain Monte Carlo sampling approach that makes it possible to determine the posterior probability of a tree (or topology) and of various evolutionary model parameters, for instance branch lengths or substitution rates. Similar approaches are implemented in BEAST [2] and MrBayes [1]. ExaBayes draws heavily on the latter in particular.

ExaBayes comes with the most commonly used evolutionary models, such as the generalized time reversible (GTR) model of character substitution and the discretized Γ model of among-site rate heterogeneity, and it estimates trees with unconstrained branch lengths. For clocked tree models or less parameter-rich substitution models, we refer you to the established tools.

The distinguishing feature of ExaBayes is its capability to handle enormous datasets efficiently. ExaBayes provides an implementation of data parallelism using the Message Passing Interface (MPI). This means that if you conduct your analysis on a computing cluster composed of several machines (a.k.a. nodes), the memory needed to evaluate the likelihood of trees and parameters given a large alignment can be spread out across multiple computing nodes. Consequently, the size of the concatenated alignment ExaBayes can handle is limited only by the combined main memory of your entire computing cluster.
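As a rough illustration of why spreading the alignment across nodes matters, the sketch below estimates the memory needed for conditional likelihood vectors and the share each MPI process would hold if the alignment sites were divided evenly. The formula (one vector per inner node, each holding sites × rate categories × states double-precision values) is a common back-of-the-envelope approximation and an assumption here, not ExaBayes' actual memory accounting. The function names and the example numbers are hypothetical.

```python
def likelihood_memory_bytes(num_taxa, num_sites, num_states=4,
                            num_rate_cats=4, bytes_per_value=8):
    """Approximate bytes for the conditional likelihood vectors.

    Assumption: an unrooted binary tree with num_taxa tips has
    num_taxa - 2 inner nodes, and each inner node stores one value per
    site, per rate category, per character state.
    """
    inner_nodes = num_taxa - 2
    return inner_nodes * num_sites * num_states * num_rate_cats * bytes_per_value


def per_process_bytes(total_bytes, num_processes):
    """Share of the total per process when sites are split evenly (rounded up)."""
    return -(-total_bytes // num_processes)


def fits(num_taxa, num_sites, num_processes, mem_per_process_bytes):
    """True if the estimated per-process share fits in the given memory."""
    total = likelihood_memory_bytes(num_taxa, num_sites)
    return per_process_bytes(total, num_processes) <= mem_per_process_bytes


if __name__ == "__main__":
    gib = 1024 ** 3
    total = likelihood_memory_bytes(200, 10_000_000)
    print(f"total estimate: {total / gib:.1f} GiB")
    print("fits on 1 process with 4 GiB:", fits(200, 10_000_000, 1, 4 * gib))
    print("fits on 64 processes with 4 GiB each:", fits(200, 10_000_000, 64, 4 * gib))
```

Under these assumptions, a DNA alignment with 200 taxa and 10 million sites is far too large for a single 4 GiB process but fits once it is divided among 64 processes.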

In addition, ExaBayes implements chain-level and run-level parallelism, techniques to trade runtime for a reduced memory footprint, a subtree equality vector approach that reduces memory use without loss of runtime, a native AVX implementation for evaluating likelihood and parsimony scores (i.e., ExaBayes makes full use of your cutting-edge CPU), and techniques to efficiently handle an arbitrary number of partitions. We use the highly efficient parsimony and likelihood implementation of RAxML [4]. Many of the techniques described above are adapted from or inspired by our experiences with large-scale maximum likelihood inferences using RAxML-Light/ExaML [5,6].
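To make the idea of combining run-level, chain-level, and data parallelism more concrete, the following sketch divides a pool of MPI ranks first among independent runs, then among the chains of each run, and finally among the workers that share one chain's alignment data. This layout is purely a hypothetical illustration: the function name, its parameters, and the contiguous block assignment are assumptions and do not describe how ExaBayes itself maps processes.

```python
def assign_rank(rank, total_ranks, runs_parallel, chains_parallel):
    """Map a rank to (run_id, chain_id, worker_id) in a hypothetical layout.

    Ranks are split into contiguous blocks: one block per run, one
    sub-block per chain within a run, and the ranks inside a sub-block
    act as data-parallel workers for that chain.
    """
    groups = runs_parallel * chains_parallel
    if total_ranks % groups != 0:
        raise ValueError("total_ranks must be divisible by runs * chains")
    if not 0 <= rank < total_ranks:
        raise ValueError("rank out of range")

    workers_per_chain = total_ranks // groups
    ranks_per_run = chains_parallel * workers_per_chain

    run_id, offset_in_run = divmod(rank, ranks_per_run)
    chain_id, worker_id = divmod(offset_in_run, workers_per_chain)
    return run_id, chain_id, worker_id


if __name__ == "__main__":
    # 16 ranks, 2 runs in parallel, 2 chains per run: 4 workers per chain.
    for r in range(16):
        print(r, assign_rank(r, 16, 2, 2))
```

The trade-off this exposes is that every rank assigned to an additional run or chain is a rank not available for splitting the alignment, so memory-bound analyses would favor more workers per chain.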

The ExaBayes package contains all tools necessary for post-processing your sampled chains. For visualization of parameter distributions, we recommend Tracer and FigTree (with which ExaBayes parameter files are compatible).

Required Modules

Serial

  • gcc/4.7.2
  • exabayes

Parallel (MPI)

  • gcc/4.7.2
  • openmpi/1.6.5
  • exabayes

System Variables

  • HPC_EXABAYES_DIR - installation directory
  • HPC_EXABAYES_BIN - executable directory
  • HPC_EXABAYES_DOC - documentation directory
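A script can check whether these variables are set before trying to locate the executables or the manual. The minimal sketch below only reads the three variable names listed above; the helper name `exabayes_paths` is hypothetical, and the variables are assumed to be defined only after the exabayes module has been loaded.

```python
import os


def exabayes_paths(env=None):
    """Return the ExaBayes directories from the environment (None if unset)."""
    env = os.environ if env is None else env
    names = {
        "dir": "HPC_EXABAYES_DIR",  # installation directory
        "bin": "HPC_EXABAYES_BIN",  # executable directory
        "doc": "HPC_EXABAYES_DOC",  # documentation directory
    }
    return {key: env.get(var) for key, var in names.items()}


if __name__ == "__main__":
    for key, path in exabayes_paths().items():
        print(f"{key}: {path or 'not set (is the exabayes module loaded?)'}")
```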




Citation

If you publish research that uses exabayes, you must cite it as follows:

  1. Fredrik Ronquist, Maxim Teslenko, Paul van der Mark, Daniel L Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc A Suchard, and John P Huelsenbeck. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology, 61(3):539-542, May 2012. DOI: 10.1093/sysbio/sys029
  2. Alexei J Drummond, Marc A Suchard, Dong Xie, and Andrew Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29(8):1969-1973, August 2012. DOI: 10.1093/molbev/mss075
  3. Clemens Lakner, Paul van der Mark, John P Huelsenbeck, Bret Larget, and Fredrik Ronquist. Efficiency of Markov chain Monte Carlo tree proposals in Bayesian phylogenetics. Systematic Biology, 57(1):86-103, February 2008. DOI: 10.1080/10635150801886156
  4. Alexandros Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22(21):2688-2690, November 2006. DOI: 10.1093/bioinformatics/btl446
  5. Alexandros Stamatakis, Andre J Aberer, Christian Goll, Stephen A Smith, Simon A Berger, and Fernando Izquierdo-Carrasco. RAxML-Light: a tool for computing terabyte phylogenies. Bioinformatics, 28(15):2064-2066, August 2012. DOI: 10.1093/bioinformatics/bts309