MMseqs2
Description
MMseqs2 (Many-against-Many sequence searching) is a software suite for searching and clustering very large protein and nucleotide sequence sets. It is open source, GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and scales well. MMseqs2 can run 10000 times faster than BLAST. At 100 times the speed of BLAST it achieves almost the same sensitivity. It can also perform profile searches with the same sensitivity as PSI-BLAST at over 400 times the speed of PSI-BLAST.
Environment Modules
Run the following command to find out what environment modules are available for this application:
$ module spider mmseqs2
System Variables
- HPC_MMSEQS2_DIR - installation directory
- HPC_MMSEQS2_BIN - executable directory
- HPC_MMSEQS2_DOC - documentation directory
- HPC_MMSEQS2_EXE - example directory
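The variables above are only defined once the module is loaded. As a minimal sketch, the helper below (show_mmseqs2_vars is a hypothetical name, not part of the module) prints each variable and marks any that are not set in the current shell:

```shell
# Assumption: the mmseqs2 module has been loaded in this shell beforehand.
# Unset variables are reported as "(not set)" instead of causing an error.
show_mmseqs2_vars() {
    for var in HPC_MMSEQS2_DIR HPC_MMSEQS2_BIN HPC_MMSEQS2_DOC HPC_MMSEQS2_EXE; do
        # Look up the value of the variable whose name is stored in $var
        eval "val=\${$var:-}"
        printf '%s=%s\n' "$var" "${val:-(not set)}"
    done
}

show_mmseqs2_vars
```

If every line shows "(not set)", the module is most likely not loaded in the current session.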
Additional Information
The databases module of MMseqs2 downloads and sets up protein, nucleotide, and profile databases. To list the curated datasets available, load the environment module and run the following command:
$ mmseqs databases
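Beyond listing datasets, the same module can download one of them when given a dataset name, an output database prefix, and a temporary directory. The sketch below wraps that call in a function. The function name fetch_swissprot, the DB_DIR location, and the choice of UniProtKB/Swiss-Prot are illustrative assumptions; pick a dataset name from the list printed by mmseqs databases.

```shell
# Hypothetical output location; point this at storage you own.
DB_DIR="${DB_DIR:-$PWD/mmseqs_db}"

fetch_swissprot() {
    # Create the output directory and a scratch directory for the download
    mkdir -p "$DB_DIR/tmp"
    # Argument order: <dataset name> <output database prefix> <temporary directory>
    mmseqs databases UniProtKB/Swiss-Prot "$DB_DIR/swissprot" "$DB_DIR/tmp"
}
```

After loading the environment module, call fetch_swissprot to start the download. Large datasets can take considerable time and disk space, so check the hosted databases first.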
For your convenience, the following databases are hosted by HPG and available at the following paths:
- /data/reference/mmseqs/14/SILVA
- /data/reference/mmseqs/14/uniref50
- /data/reference/mmseqs/14/GTDB
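A hosted database can be used as the target of a search without downloading anything. The following is a sketch only: search_hosted_db is a hypothetical helper, and it assumes that TARGET_DB points at the actual database prefix. The paths listed above may be directories rather than prefixes, so list the directory contents first and adjust TARGET_DB accordingly.

```shell
# Assumption: adjust to the real database prefix found inside the hosted path.
TARGET_DB="${TARGET_DB:-/data/reference/mmseqs/14/uniref50}"

search_hosted_db() {
    query="$1"            # FASTA file with your query sequences
    result="$2"           # output file with tab-separated hits
    tmp_dir="${3:-tmp}"   # scratch directory for intermediate files
    mkdir -p "$tmp_dir"
    # Argument order: <query> <target database> <result file> <temporary directory>
    mmseqs easy-search "$query" "$TARGET_DB" "$result" "$tmp_dir"
}
```

For example, after loading the environment module, search_hosted_db queries.fasta hits.m8 searches the sequences in queries.fasta (a placeholder file name) against the chosen database.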
Citation
If you publish research that uses MMseqs2, cite it as follows: