MMseqs2

Description

mmseqs2 website  

MMseqs2 (Many-against-Many sequence searching) is a software suite for searching and clustering very large protein and nucleotide sequence sets. It is open-source, GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and scales well. According to its developers, MMseqs2 can run 10000 times faster than BLAST; at 100 times BLAST's speed it achieves almost the same sensitivity. It can also perform profile searches with the same sensitivity as PSI-BLAST at over 400 times PSI-BLAST's speed.


Environment Modules

Run module spider mmseqs2 to find out what environment modules are available for this application.
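
A typical session might look like the sketch below. It assumes an Lmod-style module command is available in your shell and that the module can be loaded under the name mmseqs2, as the spider query above suggests. The final line uses the mmseqs version subcommand to confirm which build is active.

```shell
# Module name taken from the spider query above; assumed to be loadable as-is.
MODULE_NAME="mmseqs2"

if command -v module >/dev/null 2>&1; then
    module spider "$MODULE_NAME"   # list the available versions
    module load "$MODULE_NAME"     # load the default version
    mmseqs version                 # confirm which MMseqs2 build is now on PATH
else
    echo "module command not found; run this on a cluster node" >&2
fi
```

To pin a specific release, replace the plain module name with one of the name/version strings that module spider prints.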

System Variables

  • HPC_MMSEQS2_DIR - installation directory
  • HPC_MMSEQS2_BIN - executable directory
  • HPC_MMSEQS2_DOC - documentation directory
  • HPC_MMSEQS2_EXE - example directory
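
As a quick check, the variables above can be printed once the environment module is loaded. This is only a sketch: the helper name show_mmseqs_vars is made up for illustration, and the variables are assumed to be exported by the module.

```shell
# Print each variable set by the module, or a hint if it is not set.
# show_mmseqs_vars is a hypothetical helper name, not part of MMseqs2.
show_mmseqs_vars() {
    for var in HPC_MMSEQS2_DIR HPC_MMSEQS2_BIN HPC_MMSEQS2_DOC HPC_MMSEQS2_EXE; do
        value=$(printenv "$var") || value="(unset; load the module first)"
        printf '%s=%s\n' "$var" "$value"
    done
}

show_mmseqs_vars

# Browse the example files shipped with the installation, if the directory exists.
if [ -n "${HPC_MMSEQS2_EXE:-}" ] && [ -d "$HPC_MMSEQS2_EXE" ]; then
    ls "$HPC_MMSEQS2_EXE"
fi
```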


Additional Information

The databases module of mmseqs downloads and sets up protein, nucleotide, and profile databases. To list the curated datasets available, load the mmseqs2 environment module and run the following command: $ mmseqs databases
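
To download one of the listed datasets, the same subcommand takes a dataset name, an output database prefix, and a temporary directory. The sketch below assumes the environment module is already loaded. The output and temporary paths are hypothetical placeholders, and UniProtKB/Swiss-Prot is used only as a small example from the list.

```shell
# Dataset name as printed by `mmseqs databases`; Swiss-Prot serves as a small example.
DB_NAME="UniProtKB/Swiss-Prot"
# Hypothetical output prefix and temporary directory; adjust to your own storage.
OUT_DB="$PWD/mmseqs_db/swissprot"
TMP_DIR="$PWD/mmseqs_tmp"

if command -v mmseqs >/dev/null 2>&1; then
    mkdir -p "$(dirname "$OUT_DB")" "$TMP_DIR"
    # Usage: mmseqs databases <name> <output database> <tmp directory>
    mmseqs databases "$DB_NAME" "$OUT_DB" "$TMP_DIR"
else
    echo "mmseqs not found; load the environment module first" >&2
fi
```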

For your convenience, the following databases are hosted by HPG and available at the following paths:

  • /data/reference/mmseqs/14/SILVA
  • /data/reference/mmseqs/14/uniref50
  • /data/reference/mmseqs/14/GTDB
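
A search against one of these hosted databases could look like the sketch below, which uses the easy-search workflow of mmseqs. The database prefix inside the directory is an assumption (list the directory to find the actual name), and the query, result, and temporary names are hypothetical.

```shell
# One of the hosted database directories listed above.
DB_DIR="/data/reference/mmseqs/14/uniref50"
# ASSUMPTION: the database files inside DB_DIR share the prefix "uniref50".
# Run `ls "$DB_DIR"` and adjust TARGET_DB if the actual prefix differs.
TARGET_DB="$DB_DIR/uniref50"
# Hypothetical input, output, and temporary names.
QUERY="query.fasta"
RESULT="result.m8"
TMP_DIR="tmp"

if command -v mmseqs >/dev/null 2>&1 && [ -f "$QUERY" ] && [ -d "$DB_DIR" ]; then
    # Usage: mmseqs easy-search <query FASTA> <target database> <result file> <tmp directory>
    mmseqs easy-search "$QUERY" "$TARGET_DB" "$RESULT" "$TMP_DIR"
else
    echo "need mmseqs on PATH, $QUERY, and $DB_DIR" >&2
fi
```

By default, easy-search writes its hits in a BLAST-style tab-separated format.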



Citation

If you publish research that uses MMseqs2, you must cite it as follows:

Mirdita M, Steinegger M and Soeding J. MMseqs2 desktop and local web server app for fast, interactive sequence searches. Bioinformatics, doi: 10.1093/bioinformatics/bty1057 (2019).