Difference between revisions of "Sate"
Latest revision as of 20:43, 21 December 2022
Description
SATé is a software package for inferring a sequence alignment and a phylogenetic tree. Its iterative algorithm repeats alignment and tree-searching operations: the original data set is divided into smaller subproblems by a tree-based decomposition, and these subproblems are aligned and then merged for phylogenetic tree inference. For more information, refer to the publications of Liu et al. listed under Citation.
The implementation developed at the University of Kansas was written by Jiaye Yu, Mark Holder, Jeet Sukumaran, and Siavash Mirarab. By default, this implementation uses the "SATe-II fast" settings. The primary difference is the use of recursive CT-1 decomposition instead of the CT-5 decomposition described in the original Liu et al. paper.
The alignment and tree-searching routines are implemented by calling external programs that were not written by the SATé developers but are bundled with the SATé distribution.
Currently, the following tools are supported and bundled with the SATé distribution:
- ClustalW 2.0.12
- MAFFT 6.717
- MUSCLE 3.7
- OPAL 1.0.3
- PRANK 100311
- RAxML 7.2.6
- FastTree 2.1.4
Environment Modules
Run module spider sate to find out what environment modules are available for this application.
System Variables
- HPC_SATE_DIR - installation directory
- HPC_SATE_BIN - executable directory
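As a minimal sketch, a job script could confirm that both variables are defined before relying on them. The helper name check_sate_env is hypothetical, and the sketch assumes the variables are set when the sate environment module is loaded.

```shell
# Hypothetical helper: succeeds only if both SATe variables are set and non-empty.
check_sate_env() {
    status=0
    for name in HPC_SATE_DIR HPC_SATE_BIN; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set; load the sate module first" >&2
            status=1
        else
            echo "$name=$value"
        fi
    done
    return "$status"
}
```

Calling check_sate_env near the top of a job script prints the two paths, or returns a non-zero status if either variable is missing.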
Additional Information
Note: By default, SATé uses your home directory as the location for temporary files. It is a violation of HPC policy for jobs to write to your home directory, so it is critical that you include the --temporaries= flag on your SATé command line to provide an alternative path for the temporary files. PBS provides the $TMPDIR variable for you, and this is an excellent option; see the example submission script. Another convenient variable is $PBS_O_WORKDIR: for example, --temporaries=$PBS_O_WORKDIR/temp would also work well.
For all command line options, run:
run_sate.py -h
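The choice of temporary directory described in the note above can be sketched as a small shell function. This is an illustration under stated assumptions, not a site-provided script: the helper name sate_temporaries_flag is hypothetical, and only the --temporaries= flag and the two PBS variables come from the text above.

```shell
# Hypothetical helper: print the --temporaries flag for SATe,
# preferring $TMPDIR and falling back to $PBS_O_WORKDIR/temp.
sate_temporaries_flag() {
    if [ -n "${TMPDIR:-}" ]; then
        echo "--temporaries=$TMPDIR"
    elif [ -n "${PBS_O_WORKDIR:-}" ]; then
        echo "--temporaries=$PBS_O_WORKDIR/temp"
    else
        # Refuse to guess: without either variable SATe would fall back to the home directory.
        echo "neither TMPDIR nor PBS_O_WORKDIR is set" >&2
        return 1
    fi
}
```

Inside a job script, the output of sate_temporaries_flag can be passed to run_sate.py ahead of your usual options. If the $PBS_O_WORKDIR/temp fallback is used, it may be safest to create that directory first with mkdir -p, since this sketch does not do so.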
Citation
If you use the software in a publication, please cite the software, the papers describing the method, and the appropriate citation for the external tools.
Algorithm citations
- Liu, K., S. Raghavan, S. Nelesen, C. R. Linder, T. Warnow, 2009. "Rapid and accurate large scale coestimation of sequence alignments and phylogenetic trees." Science, 324(5934), pp. 1561-1564, 19 June 2009, doi: 10.1126/science.1171243
- Liu, K., T.J. Warnow, M.T. Holder, S. Nelesen, J. Yu, A. Stamatakis, and C.R. Linder, 2012. "SATé-II: Very Fast and Accurate Simultaneous Estimation of Multiple Sequence Alignments and Phylogenetic Trees." Systematic Biology, 61(1), pp. 90-106
Citations for the SATé software itself and its dependencies
- Jiaye Yu and Mark T. Holder. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 1.2 or earlier)
- Jiaye Yu, Mark T. Holder, Jeet Sukumaran, and Siavash Mirarab. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 1.2.1 to 2.1.0)
- Jiaye Yu, Mark T. Holder, Jeet Sukumaran, Siavash Mirarab, and Jamie Oaks. "SATé version VERSION_NUMBER_HERE" from http://phylo.bio.ku.edu/software/sate/sate.html, DATE DOWNLOADED. (for version 2.2.0 or later)
- Sukumaran, J. and Mark T. Holder. 2010. "DendroPy: A Python library for phylogenetic computing". Bioinformatics 26: 1569-1571. (for all SATé versions from this website)
External tool citations
Please remember to cite the aligner and tree inference tools that you use during the course of a SATé run. The exact citation will depend on what tools you choose to use:
- Mafft: See the References section on http://mafft.cbrc.jp/alignment/software/
- RAxML: See the Publications section on http://wwwkramer.in.tum.de/exelixis/publications.html
- Opal: Wheeler, T.J. and Kececioglu, J.D. Multiple alignment by aligning alignments. Proceedings of the 15th ISCB Conference on Intelligent Systems for Molecular Biology, Bioinformatics 23, i559-i568, 2007. And see http://opal.cs.arizona.edu/
- Muscle: Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32(5): 1792-1797. doi:10.1093/nar/gkh340. Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5: 113. doi:10.1186/1471-2105-5-113. See http://www.drive5.com/muscle/
- Clustal: See the References section of ftp://ftp.ebi.ac.uk/pub/software/clustalw2/clustalx_help.html
- Prank: See http://www.ebi.ac.uk/goldman-srv/prank/prank
- FastTree: Price MN, Dehal PS, Arkin AP. (2010) FastTree 2: Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE 5(3): e9490. doi:10.1371/journal.pone.0009490.