ExaBayes

From UFRC

Description

exabayes website  

The user manual is available online or in the $HPC_EXABAYES_DOC directory.

ExaBayes is a tool for Bayesian phylogenetic analyses. It implements a Markov chain Monte Carlo (MCMC) sampling approach that estimates the posterior probability of a tree (or topology) and of various evolutionary model parameters, such as branch lengths or substitution rates. Similar approaches are implemented in BEAST [2] and MrBayes [1]; ExaBayes draws heavily on the latter in particular.
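To make the sampling idea concrete, here is a minimal sketch of a random-walk Metropolis-Hastings sampler on a toy problem: the posterior of a coin bias under a uniform prior. This is not ExaBayes code. ExaBayes samples tree topologies, branch lengths, and substitution parameters with far more elaborate proposals. The function names, step size, and data (30 heads in 100 flips) are assumptions chosen for illustration.

```python
import math
import random


def log_posterior(p, heads, flips):
    """Unnormalized log posterior of a coin bias p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return heads * math.log(p) + (flips - heads) * math.log(1.0 - p)


def metropolis_hastings(heads, flips, n_samples, step=0.1, seed=1):
    """Random-walk Metropolis-Hastings; returns the list of sampled p values."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, heads, flips)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centered on the current state.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, heads, flips)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


samples = metropolis_hastings(heads=30, flips=100, n_samples=20000)
burn_in = 2000  # discard early samples taken before the chain settles
kept = samples[burn_in:]
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean: {posterior_mean:.3f}")
print(f"analytic posterior mean:  {(30 + 1) / (100 + 2):.3f}")
```

For this toy model the exact posterior is a Beta distribution, so the sampled mean can be checked against the analytic value. No such closed form exists for phylogenetic posteriors, which is why sampling is used there.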

ExaBayes comes with the most commonly used evolutionary models, such as the generalized time-reversible (GTR) model of character substitution and the discretized Γ model of among-site rate heterogeneity, and it estimates trees with unconstrained branch lengths. For clocked tree models or less parameter-rich substitution models, we refer you to the established tools.

The distinguishing feature of ExaBayes is its ability to handle enormous datasets efficiently. ExaBayes implements data parallelism using the Message Passing Interface (MPI). This means that if you run your analysis on a computing cluster composed of several machines (nodes), the memory needed to evaluate the likelihood of trees and parameters for a large alignment can be spread across multiple nodes. As a result, the size of the concatenated alignment that ExaBayes can handle is limited only by the combined main memory of your entire cluster.
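The effect of data parallelism on per-node memory can be sketched as follows: if the alignment columns (sites) are divided among MPI ranks, each rank only needs to hold likelihood data for its own share. The function `split_sites` and the even, contiguous split are illustrative assumptions, not the distribution scheme ExaBayes actually uses.

```python
def split_sites(n_sites, n_ranks):
    """Return a half-open (start, end) column range for each rank."""
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    base, extra = divmod(n_sites, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional site.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


ranges = split_sites(n_sites=1_000_003, n_ranks=8)
largest_share = max(end - start for start, end in ranges)
print(f"largest per-rank share: {largest_share} of 1000003 sites")
```

Doubling the number of ranks roughly halves the largest share, which is why the total alignment size is bounded by the combined memory of all nodes and not by the memory of any single node.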

In addition, ExaBayes implements:

  • chain-level and run-level parallelism
  • techniques to trade runtime for a reduced memory footprint
  • a subtree equality vector approach that reduces memory use without a runtime penalty
  • a native AVX implementation for evaluating likelihood and parsimony scores, which takes advantage of the vector instructions available on recent CPUs
  • techniques to efficiently handle an arbitrary number of partitions

ExaBayes uses the highly efficient parsimony and likelihood implementation of RAxML [4]. Many of the techniques described above are adapted from or inspired by the developers' experience with large-scale maximum likelihood inference using RAxML-Light/ExaML [5,6].

The ExaBayes package contains all the tools needed to post-process your sampled chains. For visualizing parameter distributions, we recommend Tracer and FigTree (ExaBayes parameter files are compatible with both).

Required Modules

Serial

  • gcc/4.7.2
  • exabayes

Parallel (MPI)

  • gcc/4.7.2
  • openmpi/1.6.5
  • exabayes

System Variables

  • HPC_EXABAYES_DIR - installation directory
  • HPC_EXABAYES_BIN - executable directory
  • HPC_EXABAYES_DOC - documentation directory
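
As a quick sanity check, a short script can confirm that these variables are set in the current environment, which is normally the case only after the exabayes module has been loaded. This is a minimal sketch. The helper names `exabayes_paths` and `missing_vars` are hypothetical and not part of ExaBayes or the module system.

```python
import os

# Variable names as listed in the System Variables section above.
EXABAYES_VARS = ("HPC_EXABAYES_DIR", "HPC_EXABAYES_BIN", "HPC_EXABAYES_DOC")


def exabayes_paths(environ=None):
    """Map each variable name to its value, or None if it is unset."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in EXABAYES_VARS}


def missing_vars(paths):
    """Return the names whose value is unset or empty."""
    return [name for name, value in paths.items() if not value]


paths = exabayes_paths()
missing = missing_vars(paths)
if missing:
    print("not set (is the exabayes module loaded?):", ", ".join(missing))
else:
    for name, value in paths.items():
        print(f"{name} = {value}")
```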




Citation

If you publish research that uses ExaBayes, you must cite it. The references cited in the description above are:

  1. Fredrik Ronquist, Maxim Teslenko, Paul van der Mark, Daniel L Ayres, Aaron Darling, Sebastian Höhna, Bret Larget, Liang Liu, Marc A Suchard, and John P Huelsenbeck. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology, 61(3):539-42, May 2012.
  2. Alexei J Drummond, Marc A Suchard, Dong Xie, and Andrew Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29(8):1969-73, August 2012.
  3. Clemens Lakner, Paul van der Mark, John P Huelsenbeck, Bret Larget, and Fredrik Ronquist. Efficiency of Markov chain Monte Carlo tree proposals in Bayesian phylogenetics. Systematic Biology, 57(1):86-103, February 2008.
  4. Alexandros Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22(21):2688-2690, November 2006.
  5. Alexandros Stamatakis, Andre J Aberer, Christian Goll, Stephen A Smith, Simon A Berger, and Fernando Izquierdo-Carrasco. RAxML-Light: a tool for computing terabyte phylogenies. Bioinformatics, 28(15):2064-6, August 2012.