Difference between revisions of "OrthoMCL"
Revision as of 17:53, 10 August 2012
Description
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families, which makes it a useful tool for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog (recent paralog) pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are interlinked in a similarity graph. MCL (the Markov Clustering algorithm, Van Dongen 2000; www.micans.org/mcl) is then invoked to split mega-clusters, a step analogous to the manual review in COG construction. Because MCL clusters on the weights between each pair of proteins, the weights are normalized before MCL is run to correct for differences in evolutionary distance.
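The reciprocal-best-hit step described above can be illustrated with a minimal Python sketch. This is not OrthoMCL's own code: the function names, the `(query, subject, score)` tuple layout, and the toy protein identifiers are all assumptions made for illustration, and the sketch omits the weight normalization and the MCL clustering that follow in the real pipeline.

```python
def best_hits(hits, genome_of):
    """Return {(query, target_genome): (subject, score)} keeping the top score.

    `hits` is an iterable of (query, subject, score) tuples (hypothetical
    layout); `genome_of` maps each protein id to its genome label.
    """
    best = {}
    for query, subject, score in hits:
        if query == subject:
            continue  # ignore self-hits
        key = (query, genome_of[subject])
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return best


def reciprocal_best_pairs(hits, genome_of):
    """Split reciprocal best hits into potential ortholog and in-paralog pairs."""
    best = best_hits(hits, genome_of)
    pairs = {"ortholog": set(), "inparalog": set()}
    for (query, _genome), (subject, _score) in best.items():
        back = best.get((subject, genome_of[query]))
        if back is None or back[0] != query:
            continue  # the hit is not reciprocated
        same_genome = genome_of[query] == genome_of[subject]
        kind = "inparalog" if same_genome else "ortholog"
        pairs[kind].add(tuple(sorted((query, subject))))
    return pairs


# Toy data: two genomes, A and B, with two proteins each (made-up scores).
genome_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
hits = [
    ("a1", "b1", 100), ("b1", "a1", 95),
    ("a1", "b2", 40), ("b2", "a1", 40),
    ("a2", "b2", 80), ("b2", "a2", 85),
    ("a1", "a2", 60), ("a2", "a1", 60),
]
pairs = reciprocal_best_pairs(hits, genome_of)
print(pairs)
```

In this sketch a pair is kept only when each protein is the other's top-scoring hit in the relevant genome; pairs from the same genome are reported as potential in-paralogs and pairs from different genomes as potential orthologs.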
Required Modules
Serial
- orthomcl
- HPC_ORTHOMCL_BIN - executable directory
- HPC_ORTHOMCL_CONF - configuration directory. It contains the orthomcl.config file that provides the MySQL connection settings, credentials, and the database structure.
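
As a rough illustration of how a script might use these two variables, the Python sketch below resolves the executable directory and the path to orthomcl.config. It assumes the variables are already set in the environment (for example after the orthomcl module has been loaded); the helper name `orthomcl_paths` is hypothetical and is not part of OrthoMCL or the module system.

```python
import os
from pathlib import Path


def orthomcl_paths(env=os.environ):
    """Return (executable directory, path to orthomcl.config).

    Reads HPC_ORTHOMCL_BIN and HPC_ORTHOMCL_CONF from `env` and fails
    early with a clear message if either one is missing.
    """
    bin_dir = env.get("HPC_ORTHOMCL_BIN")
    conf_dir = env.get("HPC_ORTHOMCL_CONF")
    if not bin_dir or not conf_dir:
        raise RuntimeError(
            "HPC_ORTHOMCL_BIN and HPC_ORTHOMCL_CONF are not set; "
            "load the orthomcl module first"
        )
    return Path(bin_dir), Path(conf_dir) / "orthomcl.config"


# Example with a stand-in environment (the paths are made up).
fake_env = {
    "HPC_ORTHOMCL_BIN": "/example/orthomcl/bin",
    "HPC_ORTHOMCL_CONF": "/example/orthomcl/conf",
}
bin_dir, config_file = orthomcl_paths(fake_env)
print(bin_dir, config_file)
```

Because orthomcl.config holds database credentials, a personal copy of it is best kept readable only by its owner.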