Sate

From UFRC
Revision as of 20:34, 12 August 2022 by Israel.herrera (talk | contribs)

Description

sate website  

SATé is a software package for inferring a sequence alignment and a phylogenetic tree. The algorithm is iterative: each iteration repeats the alignment and tree-search operations. The original data set is divided into smaller subproblems by a tree-based decomposition. These subproblems are aligned and then merged for phylogenetic tree inference. For more information, please refer to the publications of Liu et al. listed under Citation.

The implementation developed at the University of Kansas was written by Jiaye Yu, Mark Holder, Jeet Sukumaran, and Siavash Mirarab. By default, this implementation uses the "SATe-II fast" settings. The primary difference from the original algorithm is the use of recursive CT-1 decomposition instead of the CT-5 decomposition described in the original Liu et al. paper.

The alignment and tree-search routines are implemented by calling external programs that were not written by the SATé authors but are bundled with the SATé distribution.

The following tools are currently supported and bundled with the SATé distribution:

  • ClustalW 2.0.12
  • MAFFT 6.717
  • MUSCLE 3.7
  • OPAL 1.0.3
  • PRANK 100311
  • RAxML 7.2.6
  • FastTree 2.1.4

Environment Modules

Run module spider sate to find out what environment modules are available for this application.

System Variables

  • HPC_SATE_DIR - installation directory
  • HPC_SATE_BIN - executable directory
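
As a quick sanity check, the two variables above can be inspected once a module is loaded. The sketch below is only an illustration: the function names check_dir_var and check_sate_env are hypothetical helpers, not part of SATé or of the module system, and they simply report whether HPC_SATE_DIR and HPC_SATE_BIN are set and point at existing directories.

```shell
#!/bin/bash
# Hypothetical helper: report on one directory-valued variable.
#   $1 = variable name (used only in the message)
#   $2 = the variable's current value (may be empty)
check_dir_var() {
    if [ -z "$2" ]; then
        echo "$1 is not set (is the sate module loaded?)"
        return 1
    elif [ ! -d "$2" ]; then
        echo "$1=$2 (directory not found)"
        return 1
    fi
    echo "$1=$2"
}

# Hypothetical helper: check both variables documented for the SATe module.
# Returns 0 only if both are set and refer to existing directories.
check_sate_env() {
    status=0
    check_dir_var HPC_SATE_DIR "${HPC_SATE_DIR:-}" || status=1
    check_dir_var HPC_SATE_BIN "${HPC_SATE_BIN:-}" || status=1
    return "$status"
}
```

A typical session might load the module first (for example module load sate, assuming that is the module name reported by module spider sate), then call check_sate_env and list the executables with ls "$HPC_SATE_BIN".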

Additional Information

Note: By default, SATé uses your home directory as the location for temporary files. It is a violation of HPC policy for jobs to write to your home directory, so it is critical that you include the --temporaries= flag in your SATé command line to provide an alternative path for the temporary files. PBS provides the $TMPDIR variable, which is an excellent option; see the example submission script. Another convenient variable is $PBS_O_WORKDIR: for example, --temporaries=$PBS_O_WORKDIR/temp would also work well.
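
The choice of temporary directory described in the note can be sketched as a small wrapper. This is a minimal illustration under stated assumptions, not the site's official submission script: sate_tmp_dir and run_sate_with_tmp are hypothetical function names, and the only SATé option used is the --temporaries= flag mentioned above. Every other argument is passed through unchanged, so the input options still have to come from run_sate.py -h.

```shell
#!/bin/bash
# Hypothetical helper: print the directory to use for SATe temporaries.
# Prefers $TMPDIR (provided by the scheduler); otherwise falls back to a
# temp/ directory under $PBS_O_WORKDIR. Fails if neither variable is set.
sate_tmp_dir() {
    if [ -n "${TMPDIR:-}" ]; then
        printf '%s\n' "$TMPDIR"
    elif [ -n "${PBS_O_WORKDIR:-}" ]; then
        printf '%s\n' "$PBS_O_WORKDIR/temp"
    else
        echo "error: neither TMPDIR nor PBS_O_WORKDIR is set" >&2
        return 1
    fi
}

# Hypothetical wrapper: run SATe with --temporaries pointing away from the
# home directory. All arguments are forwarded to run_sate.py as given.
run_sate_with_tmp() {
    sate_tmp=$(sate_tmp_dir) || return 1
    mkdir -p "$sate_tmp" || return 1   # make sure the directory exists
    run_sate.py --temporaries="$sate_tmp" "$@"
}
```

Inside a job script, the wrapper would be called in place of run_sate.py, followed by whatever options the run needs.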

For all command line options, run:

run_sate.py -h



Citation

If you use the software in a publication, please cite the software, the papers describing the method, and the appropriate citation for the external tools.

Algorithm citations

  • Liu, K., S. Raghavan, S. Nelesen, C. R. Linder, T. Warnow, 2009. "Rapid and accurate large scale coestimation of sequence alignments and phylogenetic trees." Science, 324(5934), pp. 1561-1564, 19 June 2009, doi: 10.1126/science.1171243
  • Liu, K., T.J. Warnow, M.T. Holder, S. Nelesen, J. Yu, A. Stamatakis, and C.R. Linder, 2012. "SATé-II: Very Fast and Accurate Simultaneous Estimation of Multiple Sequence Alignments and Phylogenetic Trees." Systematic Biology, 61(1), pp. 90-106

Citations for the SATé software itself and its dependencies

  • Jiaye Yu and Mark T. Holder. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 1.2 or earlier)
  • Jiaye Yu, Mark T. Holder, Jeet Sukumaran, and Siavash Mirarab. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 1.2.1 to 2.1.0)
  • Jiaye Yu, Mark T. Holder, Jeet Sukumaran, Siavash Mirarab, and Jamie Oaks. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 2.2.0 or later)
  • Sukumaran, J. and Mark T. Holder. 2010. "DendroPy: A Python library for phylogenetic computing". Bioinformatics 26: 1569-1571. (for all SATé versions from this website)

External tool citations

Please remember to cite the aligner and tree inference tools that you use during the course of a SATé run. The exact citation will depend on what tools you choose to use: