Difference between revisions of "OrthoMCL"

From UFRC
m (Text replace - "<!-- ######## Template Configuration ######## --> <!--Edit definitions of the variables used in template calls Required variables: app - lowercase name of the application e.g. "amber" url - url of the software page (project, company prod)
(18 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
__NOTOC__
 
__NOTOC__
 
__NOEDITSECTION__
 
__NOEDITSECTION__
[[Category:Software]][[Category:Bioinformatics]][[Category:Genomics]]
+
[[Category:Software]][[Category:Biology]][[Category:Genomics]]
 
+
{|<!--Main settings - REQUIRED-->
{|
 
<!--Main settings - REQUIRED-->
 
 
|{{#vardefine:app|orthomcl}}
 
|{{#vardefine:app|orthomcl}}
 
|{{#vardefine:url|http://orthomcl.org/}}
 
|{{#vardefine:url|http://orthomcl.org/}}
<!--Compiler and MPI settings - OPTIONAL -->
+
|{{#vardefine:exe|1}} <!--Present manual instructions for running the software -->
|{{#vardefine:intel|}} <!-- E.g. "11.1" -->
 
|{{#vardefine:mpi|}} <!-- E.g. "openmpi/1.3.4" -->
 
<!--Choose sections to enable - OPTIONAL-->
 
|{{#vardefine:mod|1}} <!--Present instructions for running the software with modules -->
 
|{{#vardefine:exe|}} <!--Present manual instructions for running the software -->
 
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
 
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
 
|{{#vardefine:pbs|}} <!--Enable PBS script wiki page link-->
 
|{{#vardefine:pbs|}} <!--Enable PBS script wiki page link-->
Line 18: Line 11:
 
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
 
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
 
|{{#vardefine:faq|}} <!--Enable FAQ section -->
|{{#vardefine:citation|}} <!--Enable Reference/Citation section -->
+
|{{#vardefine:citation|1}} <!--Enable Reference/Citation section -->
 
|}
 
|}
 
<!-- ########  Template Body ######## -->
 
<!-- ########  Template Body ######## -->
Line 28: Line 21:
  
 
<!--Modules-->
 
<!--Modules-->
==Required Modules==
+
==Environment Modules==
[[Modules|modules documentation]]
+
Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
===Serial===
+
==System Variables==
*{{#var:app}}
+
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 
* HPC_ORTHOMCL_BIN - executable directory
 
* HPC_ORTHOMCL_BIN - executable directory
 
* HPC_ORTHOMCL_CONF - configuration directory. It contains the orthomcl.config file that provides the MySQL connection settings, credentials, and the database structure.
 
* HPC_ORTHOMCL_CONF - configuration directory. It contains the orthomcl.config file that provides the MySQL connection settings, credentials, and the database structure.
 
{{#if: {{#var: exe}}|==How To Run==
 
{{#if: {{#var: exe}}|==How To Run==
WRITE INSTRUCTIONS ON RUNNING THE ACTUAL BINARY|}}
+
===Database Policy===
 +
See [[Database_Removal_Procedure|the Database Removal Procedure]] for details on how application databases are removed or retained by UFRC.
 +
 
 +
===Configuration===
 +
The '<code>$HPC_ORTHOMCL_CONF/orthomcl.config</code>' file provides a template for configuring an OrthoMCL database. Copy it to your working directory with
 +
$ cp $HPC_ORTHOMCL_CONF/orthomcl.config .
 +
and change the database name from 'orthomcl_jdoe_myproject:' to your own database that has a name with a pattern <code>'orthomcl_$USER_project_name'</code> in the string <code>'dbConnectString=dbi:mysql:'''orthomcl_jdoe_myproject:''':orthomcldb.ufhpc'</code>. MySQL server will allow the orthomcl user to create databases with that name pattern.
 +
 
 +
We also provide orthomcl database management scripts for your convenience. They include
 +
* hpc_list_orthomcl_databases
 +
* hpc_create_orthomcl_database
 +
* hpc_remove_orthomcl_database
 +
 
 +
Please triple-check the name of the database you are removing to make sure you only remove one of your databases, which you absolutely do not need anymore.
 +
 
 +
Don't forget to create the schema with
 +
orthomclInstallSchema orthomcl.config
 +
after creating a database.
 +
 
 +
OrthoMCL documentation is available from several sources.
 +
 
 +
* A User Guide and other documents in text format are located in /apps/orthomcl/VERSION/doc/OrthoMCLEngine/Main/ and at [http://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt].
 +
 
 +
* An alternative User Guide can be found at [http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi0612s35/full Wiley Online Library].
 +
 
 +
* The original publications that describe the functionality of the software are [http://dx.doi.org/10.1101%2Fgr.1224503 10.1101/gr.1224503] and [http://dx.doi.org/10.1093%2Fnar%2Fgkj123 10.1093/nar/gkj123].
 +
|}}
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
 
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
 
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
Line 47: Line 66:
 
*'''Q:''' **'''A:'''|}}
 
*'''Q:''' **'''A:'''|}}
 
{{#if: {{#var: citation}}|==Citation==
 
{{#if: {{#var: citation}}|==Citation==
If you publish research that uses {{{app}}} you have to cite it as follows:
+
If you publish research that uses {{#var: app}} you have to cite it as follows:
WRITE CITATION HERE
+
#Feng Chen, Aaron J. Mackey, Christian J. Stoeckert, Jr., and David S. Roos. OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res. 2006 34: D363-8. '''Please cite this paper if you publish research results benefited from OrthoMCL-DB.'''
 +
#Li Li, Christian J. Stoeckert, Jr., and David S. Roos.  OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003 13: 2178-2189.
 +
#Feng Chen, Aaron J. Mackey, Jeroen K. Vermunt, and David S. Roos. Assessing Performance of Orthology Detection Strategies Applied to Eukaryotic Genomes. PLoS ONE 2007 2(4): e383.
 
|}}
 
|}}

Revision as of 20:12, 12 August 2022

Description

orthomcl website: http://orthomcl.org/

OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families, which makes it an important utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog/recent paralog pairs, and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. MCL (Markov Clustering algorithm, Van Dongen 2000; www.micans.org/mcl) is then invoked to split mega-clusters, a step analogous to the manual review in COG construction. Because MCL clustering is based on the weights between each pair of proteins, the weights are normalized before running MCL to correct for differences in evolutionary distance.

Environment Modules

Run module spider orthomcl to find out what environment modules are available for this application.

System Variables

  • HPC_ORTHOMCL_DIR - installation directory
  • HPC_ORTHOMCL_BIN - executable directory
  • HPC_ORTHOMCL_CONF - configuration directory. It contains the orthomcl.config file that provides the MySQL connection settings, credentials, and the database structure.
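
A quick way to confirm that these variables are available in the current shell is sketched below. It assumes the module has already been loaded (for example with `module load orthomcl`, the usual companion to `module spider`); `check_orthomcl_env` is a hypothetical helper name and is not provided by the module.

```shell
# Hypothetical helper: report whether each OrthoMCL module variable is set.
# Returns non-zero if any of the three variables is missing or empty.
check_orthomcl_env() {
    status=0
    for name in HPC_ORTHOMCL_DIR HPC_ORTHOMCL_BIN HPC_ORTHOMCL_CONF; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set (is the module loaded?)\n' "$name" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running `check_orthomcl_env` after loading the module should print one `NAME=path` line per variable; a non-zero exit status suggests the module is not loaded in this shell.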

How To Run

Database Policy

See the Database Removal Procedure for details on how application databases are removed or retained by UFRC.

Configuration

The '$HPC_ORTHOMCL_CONF/orthomcl.config' file provides a template for configuring an OrthoMCL database. Copy it to your working directory with

$ cp $HPC_ORTHOMCL_CONF/orthomcl.config .

and change the database name from 'orthomcl_jdoe_myproject' to the name of your own database, which must follow the pattern 'orthomcl_$USER_project_name', in the string 'dbConnectString=dbi:mysql:orthomcl_jdoe_myproject::orthomcldb.ufhpc'. The MySQL server allows the orthomcl user to create databases with that name pattern.
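
The edit can also be scripted. The following is a minimal sketch, assuming the copied file still contains the placeholder name `orthomcl_jdoe_myproject`; `set_orthomcl_db` and the project name passed to it are hypothetical. Note that the shell needs `${USER}` in braces here, because `$USER_project_name` would be read as a single variable name.

```shell
# Hypothetical helper: set_orthomcl_db CONFIG_FILE PROJECT_NAME
# Replaces the template database name with orthomcl_${USER}_PROJECT_NAME
# and prints the resulting dbConnectString line for review.
# Assumes PROJECT_NAME contains only letters, digits, and underscores.
set_orthomcl_db() {
    config=$1
    project=$2
    db="orthomcl_${USER}_${project}"
    # Write to a temporary file first so a failed sed leaves the config intact.
    sed "s/orthomcl_jdoe_myproject/${db}/" "$config" > "${config}.tmp" &&
        mv "${config}.tmp" "$config"
    grep '^dbConnectString=' "$config"
}
```

For example, `set_orthomcl_db orthomcl.config myproject` run in the working directory would rewrite the local copy only; the template under `$HPC_ORTHOMCL_CONF` is left untouched.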

We also provide orthomcl database management scripts for your convenience. They include

  • hpc_list_orthomcl_databases
  • hpc_create_orthomcl_database
  • hpc_remove_orthomcl_database

Please triple-check the name of the database you are removing: make sure it is one of your own databases and that you no longer need it.
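
One way to make that check harder to skip is a small guard that tests a name against the 'orthomcl_$USER_project_name' pattern before anything is removed. This is only a sketch: `is_own_orthomcl_db` is a hypothetical helper, and it checks the name pattern only, not whether the database exists or is still needed.

```shell
# Hypothetical guard: succeed only if the name starts with
# "orthomcl_<your username>_" and has a non-empty project part after it.
is_own_orthomcl_db() {
    case "$1" in
        "orthomcl_${USER}_"?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: refuse to continue when the name does not match the pattern.
confirm_orthomcl_db() {
    if is_own_orthomcl_db "$1"; then
        printf 'OK: %s matches your database name pattern\n' "$1"
    else
        printf 'Refusing: %s is not named orthomcl_%s_<project>\n' "$1" "$USER" >&2
        return 1
    fi
}
```

Running `confirm_orthomcl_db NAME` before calling the removal script, and comparing NAME against the output of hpc_list_orthomcl_databases, gives two independent checks on the name.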

Don't forget to create the schema with

orthomclInstallSchema orthomcl.config

after creating a database.

OrthoMCL documentation is available from several sources.

  • A User Guide and other documents in text format are located in /apps/orthomcl/VERSION/doc/OrthoMCLEngine/Main/ and at http://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt.
  • An alternative User Guide can be found at the Wiley Online Library: http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi0612s35/full
  • The original publications that describe the functionality of the software are 10.1101/gr.1224503 (http://dx.doi.org/10.1101%2Fgr.1224503) and 10.1093/nar/gkj123 (http://dx.doi.org/10.1093%2Fnar%2Fgkj123).



Citation

If you publish research that uses OrthoMCL, cite it as follows:

  1. Feng Chen, Aaron J. Mackey, Christian J. Stoeckert, Jr., and David S. Roos. OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res. 2006 34: D363-8. Please cite this paper if you publish research results that benefited from OrthoMCL-DB.
  2. Li Li, Christian J. Stoeckert, Jr., and David S. Roos. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003 13: 2178-2189.
  3. Feng Chen, Aaron J. Mackey, Jeroen K. Vermunt, and David S. Roos. Assessing Performance of Orthology Detection Strategies Applied to Eukaryotic Genomes. PLoS ONE 2007 2(4): e383.