AlphaFold: Difference between revisions

From UFRC
(12 intermediate revisions by 3 users not shown)
Line 5: Line 5:
<!--CONFIGURATION: OPTIONAL (|1}} means it's ON)-->
<!--CONFIGURATION: OPTIONAL (|1}} means it's ON)-->
|{{#vardefine:conf|}}          <!--CONFIGURATION-->
|{{#vardefine:conf|}}          <!--CONFIGURATION-->
|{{#vardefine:exe|}}            <!--ADDITIONAL INFO-->
|{{#vardefine:exe|1}}            <!--ADDITIONAL INFO-->
|{{#vardefine:job|}}            <!--JOB SCRIPTS-->
|{{#vardefine:job|}}            <!--JOB SCRIPTS-->
|{{#vardefine:policy|1}}        <!--POLICY-->
|{{#vardefine:policy|1}}        <!--POLICY-->
Line 18: Line 18:
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
{{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}


This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document.
This package provides an implementation of the inference pipeline of AlphaFold v2.3. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document.


<!--Modules-->
<!--Modules-->
==Environment Modules==
==Environment Modules==
Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
==System Variables==
* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
* HPC_{{uc:{{#var:app}}}}_BIN - executable directory


<!--Configuration-->
<!--Configuration-->
Line 34: Line 31:
{{#if: {{#var: exe}}|==Additional Information==
{{#if: {{#var: exe}}|==Additional Information==


WRITE_ADDITIONAL_INSTRUCTIONS_ON_RUNNING_THE_SOFTWARE_IF_NECESSARY
Note that AlphaFold has large memory requirements, and some of its stages use 4 or 8 CPUs in addition to a GPU. Example job scripts for a run with the test data included with the software are shown below.
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
''Expand this section to view sample script for version 2.1.2.''
<div class="mw-collapsible-content" style="padding: 5px;">
<pre>
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --constraint=ai
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --gpus=1
#SBATCH --mem=48gb
#SBATCH --time=12:00:00
date;hostname;pwd


run_alphafold.py \
    --data_dir "${HPC_ALPHAFOLD_REF}" \
    --output_dir "$(pwd)" \
    --fasta_paths query.fasta \
    --uniref90_database_path=${HPC_ALPHAFOLD_REF}/uniref90/uniref90.fasta \
    --mgnify_database_path=${HPC_ALPHAFOLD_REF}/mgnify/mgy_clusters_2018_12.fa \
    --template_mmcif_dir=${HPC_ALPHAFOLD_REF}/pdb_mmcif/mmcif_files \
    --max_template_date=2020-05-14 \
    --obsolete_pdbs_path=${HPC_ALPHAFOLD_REF}/pdb_mmcif/obsolete.dat \
    --use_gpu_relax=1 \
    --bfd_database_path=${HPC_ALPHAFOLD_REF}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniclust30_database_path=${HPC_ALPHAFOLD_REF}/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
    --pdb70_database_path=${HPC_ALPHAFOLD_REF}/pdb70/pdb70
date
</pre>
</div>
</div>
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
''Expand this section to view sample script for version 2.3.1.''
<div class="mw-collapsible-content" style="padding: 5px;">
<pre>
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --constraint=a100
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --gpus=1
#SBATCH --mem=300gb
#SBATCH --time=96:00:00
date;hostname;pwd
module load alphafold
alphafold_full_db.sh  --fasta_paths=${HPC_ALPHAFOLD_REF}/test.fasta --output_dir=~/scratch --max_template_date=2020-05-14 --use_gpu_relax=1
date
</pre>
</div>
</div>
|}}
|}}
<!--Job Scripts-->
<!--Job Scripts-->
Line 44: Line 93:
{{#if: {{#var: policy}}|==Usage Example==
{{#if: {{#var: policy}}|==Usage Example==


To simplify the usage use the 'run_alphafold.sh' script. Simple run example:
To simplify usage, use the 'alphafold_full_db.sh' wrapper script. A simple run example:


  run_alphafold.sh -o test/ -m model_1 -f query.fasta -t 2020-05-14
  alphafold_full_db.sh --fasta_paths=${HPC_ALPHAFOLD_REF}/test.fasta --output_dir=~/scratch --max_template_date=2020-05-14 --use_gpu_relax=1


Use the
  -d $HPC_ALPHAFOLD_REF  
argument if you have a custom reference data directory.
From version 2.3, the AlphaFold documentation recommends running AlphaFold as a Docker container. However, Docker is not compatible with the HPC environment. AlphaFold has therefore been installed as an Apptainer container, and the alphafold_full_db.sh wrapper script has been created to mimic the behavior of docker/run_docker.py as referenced in the AlphaFold documentation. alphafold_full_db.sh specifies the database location options required by AlphaFold.
 
To specify these options manually, use run_alphafold.sh instead.   
 
If using the --model_preset=multimer option, use the alphafold_multimer_db.sh launch script instead.  Example:
 
alphafold_multimer_db.sh --model_preset=multimer --fasta_paths=${HPC_ALPHAFOLD_REF}/test.fasta --output_dir=~/scratch --max_template_date=2020-05-14 --use_gpu_relax=1


|}}
|}}
Line 65: Line 119:
{{#if: {{#var: citation}}|==Citation==
{{#if: {{#var: citation}}|==Citation==
If you publish research that uses {{#var:app}} you have to cite it as follows:
If you publish research that uses {{#var:app}} you have to cite it as follows:
 
<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
''Expand this section to view citation instructions.''
<div class="mw-collapsible-content" style="padding: 5px;">
  @Article{AlphaFold2021,
  @Article{AlphaFold2021,
   author  = Jumper, John and Evans, Richard and Pritzel, Alexander and Green, Tim and Figurnov, Michael and Ronneberger, Olaf  
   author  = Jumper, John and Evans, Richard and Pritzel, Alexander and Green, Tim and Figurnov, Michael and Ronneberger, Olaf  
Line 80: Line 136:


  https://www.nature.com/articles/s41586-021-03819-2
  https://www.nature.com/articles/s41586-021-03819-2
 
</div>
 
</div>


<!--Installation-->
<!--Installation-->

Latest revision as of 15:38, 2 February 2024

Description

alphafold website  

This package provides an implementation of the inference pipeline of AlphaFold v2.3. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document.

Environment Modules

Run module spider alphafold to find out what environment modules are available for this application.
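
As a minimal sketch, the result of loading the module can be checked from a shell before a job is submitted. The module load alphafold command and the HPC_ALPHAFOLD_REF variable are taken from the sample scripts on this page. The helper name check_alphafold_env is hypothetical, and the check assumes that the module sets HPC_ALPHAFOLD_REF to the reference data directory.

```shell
#!/bin/bash
# Hypothetical helper: confirm that a usable reference-data directory is
# available. Assumes the alphafold module exports HPC_ALPHAFOLD_REF.
check_alphafold_env() {
    if [ -z "${HPC_ALPHAFOLD_REF:-}" ]; then
        echo "HPC_ALPHAFOLD_REF is not set; run 'module load alphafold' first" >&2
        return 1
    fi
    if [ ! -d "${HPC_ALPHAFOLD_REF}" ]; then
        echo "HPC_ALPHAFOLD_REF is not a directory: ${HPC_ALPHAFOLD_REF}" >&2
        return 1
    fi
    echo "Reference data: ${HPC_ALPHAFOLD_REF}"
}

# Only call 'module' where it exists (for example on a cluster login node).
if command -v module >/dev/null 2>&1; then
    module load alphafold
    check_alphafold_env
fi
```

Running the same check at the top of a job script makes a missing module fail early instead of partway through a long run.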


Additional Information

Note that AlphaFold has large memory requirements, and some of its stages use 4 or 8 CPUs in addition to a GPU. Example job scripts for a run with the test data included with the software are shown below.

Expand this section to view sample script for version 2.1.2.

Expand this section to view sample script for version 2.3.1.

Usage Example

To simplify usage, use the 'alphafold_full_db.sh' wrapper script. A simple run example:

alphafold_full_db.sh  --fasta_paths=${HPC_ALPHAFOLD_REF}/test.fasta --output_dir=~/scratch --max_template_date=2020-05-14 --use_gpu_relax=1
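
One shell detail is worth checking in the command above. Bash expands ~ at the start of a word (or in a variable assignment), so in an argument such as --output_dir=~/scratch the tilde reaches the program unexpanded. Whether the wrapper script expands it afterwards is not stated on this page, so the sketch below assumes it does not and builds the path from $HOME instead. The variable names literal_arg and out_arg are illustrative only.

```shell
#!/bin/bash
# The tilde sits in the middle of the word, so the shell passes it through as-is.
literal_arg=$(printf '%s' --output_dir=~/scratch)
echo "${literal_arg}"    # prints: --output_dir=~/scratch

# Using $HOME yields an absolute path regardless of how the program treats '~'.
out_arg="--output_dir=${HOME}/scratch"
echo "${out_arg}"
```

If the output does not appear where expected, look for a directory literally named ~ under the working directory.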

From version 2.3, the AlphaFold documentation recommends running AlphaFold as a Docker container. However, Docker is not compatible with the HPC environment. AlphaFold has therefore been installed as an Apptainer container, and the alphafold_full_db.sh wrapper script has been created to mimic the behavior of docker/run_docker.py as referenced in the AlphaFold documentation. alphafold_full_db.sh specifies the database location options required by AlphaFold.

To specify these options manually, use run_alphafold.sh instead.
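
To illustrate what specifying the database location options involves, the following is a hypothetical dry-run sketch, not the wrapper installed on the cluster. It puts database flags derived from HPC_ALPHAFOLD_REF in front of the options passed by the user and prints the resulting command instead of running it. The flag names and paths are copied from the version 2.1.2 sample script on this page; the databases used by later versions may differ. The function name print_alphafold_cmd and the placeholder reference path are assumptions.

```shell
#!/bin/bash
# Hypothetical dry run: print a run_alphafold.py command line with database
# paths filled in from HPC_ALPHAFOLD_REF. Flags follow the 2.1.2 sample
# script; newer releases may expect different databases.
print_alphafold_cmd() {
    ref="${HPC_ALPHAFOLD_REF:?load the alphafold module first}"
    # One argument per output line; user-supplied options come last.
    printf '%s\n' run_alphafold.py \
        "--data_dir=${ref}" \
        "--uniref90_database_path=${ref}/uniref90/uniref90.fasta" \
        "--mgnify_database_path=${ref}/mgnify/mgy_clusters_2018_12.fa" \
        "--template_mmcif_dir=${ref}/pdb_mmcif/mmcif_files" \
        "--obsolete_pdbs_path=${ref}/pdb_mmcif/obsolete.dat" \
        "--bfd_database_path=${ref}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt" \
        "--uniclust30_database_path=${ref}/uniclust30/uniclust30_2018_08/uniclust30_2018_08" \
        "--pdb70_database_path=${ref}/pdb70/pdb70" \
        "$@"
}

# Example call; the fallback path is a placeholder, not a real location.
HPC_ALPHAFOLD_REF="${HPC_ALPHAFOLD_REF:-/path/to/alphafold/reference}"
print_alphafold_cmd --fasta_paths=query.fasta --max_template_date=2020-05-14
```

Printing the command first makes it easy to see which options would have to be supplied by hand when calling run_alphafold.sh directly.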

If using the --model_preset=multimer option, use the alphafold_multimer_db.sh launch script instead. Example:

alphafold_multimer_db.sh  --model_preset=multimer --fasta_paths=${HPC_ALPHAFOLD_REF}/test.fasta --output_dir=~/scratch --max_template_date=2020-05-14 --use_gpu_relax=1
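
The choice between the two launch scripts can be captured in a small helper. This is a sketch only: pick_launcher is a hypothetical function, and it assumes that every preset other than multimer goes through alphafold_full_db.sh, which this page does not state explicitly.

```shell
#!/bin/bash
# Hypothetical helper: map a --model_preset value to the launch script named
# on this page. Assumes non-multimer presets use alphafold_full_db.sh.
pick_launcher() {
    case "$1" in
        multimer) echo "alphafold_multimer_db.sh" ;;
        *)        echo "alphafold_full_db.sh" ;;
    esac
}

preset="multimer"
launcher=$(pick_launcher "${preset}")
# Print the command rather than running it, so the sketch also works off-cluster.
echo "${launcher} --model_preset=${preset} --fasta_paths=query.fasta --max_template_date=2020-05-14"
```

Keeping the preset in one variable means the launcher and the --model_preset flag cannot drift apart within a job script.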


Citation

If you publish research that uses alphafold you have to cite it as follows:

Expand this section to view citation instructions.