CSUBST is a tool for analyzing Combinatorial SUBSTitutions of codon sequences in phylogenetic trees. Combinatorial substitutions are recurrent substitutions that occur at the same protein site on multiple independent branches. When such substitutions result in the same amino acid, they are considered convergent amino acid substitutions. The main features of CSUBST include:

  • Error-corrected rate of protein convergence, with the null expectation obtained by:
      1) Empirical or mechanistic codon substitution model
      2) Urn sampling from site-wise substitution frequencies (experimental)
  • Flexible specification of "foreground" lineages and their comparison with neighboring branches
  • Heuristic detection of higher-order convergence involving more than two branches
  • Simulated sequence evolution under specified scenarios of convergent evolution
  • Convergent substitution mapping to protein structure
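
The definitions of combinatorial and convergent substitutions given in the introduction can be illustrated with a minimal Python sketch. This is not CSUBST code or its API: the function name classify_site and the input layout (one protein site, a mapping from branch name to ancestral and derived amino acid) are assumptions made only for illustration, and the sketch ignores the error correction and null expectations that CSUBST computes.

```python
from collections import Counter


def classify_site(substitutions):
    """Classify one protein site (hypothetical helper, not part of CSUBST).

    substitutions maps a branch name to a pair (ancestral_aa, derived_aa)
    observed at this site on that branch. Branches are assumed independent.
    """
    # Keep only branches where the amino acid actually changed.
    derived_states = [
        derived for ancestral, derived in substitutions.values()
        if ancestral != derived
    ]
    # Fewer than two substituted branches: nothing recurrent at this site.
    if len(derived_states) < 2:
        return "not combinatorial"
    # Convergent if at least two branches ended in the same amino acid.
    counts = Counter(derived_states)
    if max(counts.values()) >= 2:
        return "convergent"
    return "combinatorial, not convergent"


# Two branches independently change to arginine (R) at the same site.
print(classify_site({"branch_A": ("K", "R"), "branch_B": ("Q", "R")}))
```

In this toy example the two branches start from different amino acids but end in the same one, so the site is reported as convergent; if they ended in different amino acids it would be combinatorial but not convergent.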

Environment Modules

Run module spider csubst to find out what environment modules are available for this application.

System Variables

  • HPC_CSUBST_DIR - installation directory
  • HPC_CSUBST_BIN - executable directory
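
As a quick check that the two variables listed above are visible to a job, a script can read them from its environment. The sketch below assumes a csubst module has already been loaded in the shell that starts Python; the helper name csubst_paths is hypothetical and exists only for this example.

```python
import os


def csubst_paths(env=os.environ):
    """Return the CSUBST module variables found in env (None if unset).

    Hypothetical helper for illustration; only the two variable names
    come from the module documentation.
    """
    names = ("HPC_CSUBST_DIR", "HPC_CSUBST_BIN")
    return {name: env.get(name) for name in names}


for name, value in csubst_paths().items():
    # An unset variable usually means the module is not loaded.
    print(f"{name} = {value if value else '(not set)'}")
```

If either variable prints as not set, the module is most likely not loaded in the current session.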


If you publish research that uses CSUBST, cite it as follows:

Fukushima K, Pollock DD. 2023. Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence. Nature Ecology & Evolution 7: 155–170. DOI: 10.1038/s41559-022-01932-7