Difference between revisions of "BUSCO"

Latest revision as of 19:38, 27 March 2023

Description

BUSCO (Benchmarking Universal Single-Copy Orthologs) assesses genome assembly and annotation completeness.

Environment Modules

Run module spider busco to find out what environment modules are available for this application.

System Variables

HPC_BUSCO_DIR - installation directory

Additional Information

BUSCO uses a config file. To customize it, copy it to your working directory and modify the copy as needed.

$ cp $HPC_BUSCO_CONF/config.ini .
$ export BUSCO_CONFIG_FILE=$(pwd)/config.ini
$ busco -f -i ... <other arguments>

If you don't need to modify the config file, you can use the installed copy:

$ busco -f --config ${HPC_BUSCO_CONF}/config.ini -i ... <other arguments>
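The two options above can be combined in a job script. The following is a minimal sketch, not part of BUSCO itself: pick_busco_config is a hypothetical helper name, and it assumes $HPC_BUSCO_CONF has been set by loading the busco module. It prints the path of a local config.ini if one exists and otherwise falls back to the installed copy.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): print the config file to use.
# Prefers a modified local copy named config.ini in the given directory
# (default: current directory); otherwise falls back to the installed
# copy under $HPC_BUSCO_CONF, which is assumed to be set by the module.
pick_busco_config() {
    local workdir="${1:-$(pwd)}"
    if [ -f "${workdir}/config.ini" ]; then
        printf '%s\n' "${workdir}/config.ini"
    else
        printf '%s\n' "${HPC_BUSCO_CONF}/config.ini"
    fi
}

# Usage inside a job script, after loading the busco module:
#   export BUSCO_CONFIG_FILE="$(pick_busco_config)"
#   busco -f -i ... <other arguments>
```

With this approach the same job script works whether or not a local config.ini has been created.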

Mandatory arguments

-i or --in defines the input file to analyse which is either a nucleotide fasta file or a protein fasta file, depending on the BUSCO mode. As of v5.1.0 the input argument can now also be a directory containing fasta files to run in batch mode.
-o or --out defines the folder that will contain all results, logs, and intermediate data
-m or --mode sets the assessment MODE: genome, proteins, transcriptome
-l or --lineage_dataset specifies the lineage dataset to use for the assessment
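As a rough illustration of how the four mandatory arguments fit together, the sketch below assembles a busco command line and rejects any mode other than the three listed above. build_busco_cmd is a hypothetical helper, not a BUSCO feature; it only prints the command so it can be inspected before being run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): assemble a busco command line
# from the four mandatory arguments and print it without running it.
build_busco_cmd() {
    local input="$1" output="$2" mode="$3" lineage="$4"
    # Only the three modes named in the argument list are accepted here.
    case "$mode" in
        genome|proteins|transcriptome) ;;
        *)
            echo "unknown mode: ${mode}" >&2
            return 1
            ;;
    esac
    printf 'busco -i %s -o %s -m %s -l %s\n' \
        "$input" "$output" "$mode" "$lineage"
}

# Example: print the command for a genome assessment.
build_busco_cmd target.fa SAMPLE genome metazoa
```

Printing the command first makes it easy to catch a mistyped mode or a missing argument before submitting a job.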

Datasets are located in /data/reference/busco/VERSION. The config.ini file is already configured to use the correct path.

Available datasets:

arthropoda
bacteria
eukaryota
fungi
metazoa
vertebrata

Many more are available. Run the following command to list all available datasets:

$ ls /data/reference/busco/VERSION
#for example: $ ls /data/reference/busco/v5
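To check for one dataset rather than reading the whole listing, something like the following sketch may help. find_lineage is a hypothetical helper, and it assumes that each dataset appears as an entry in the data directory whose name starts with the lineage name; verify this against the actual ls output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): print the entries in a BUSCO
# data directory whose names start with the given prefix.
# Returns non-zero if nothing matches.
find_lineage() {
    local data_dir="$1" prefix="$2"
    local entry found=1
    for entry in "${data_dir}/${prefix}"*; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$entry" ] || continue
        basename "$entry"
        found=0
    done
    return "$found"
}

# Usage, with the example path from above:
#   find_lineage /data/reference/busco/v5 metazoa
```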

Example of busco run with metazoa dataset:

busco -f -i target.fa -o SAMPLE -l ${HPC_BUSCO_DAT}/metazoa -m genome

To allow BUSCO to retrain an existing Augustus dataset, create a local copy of the data and set the $AUGUSTUS_CONFIG_PATH variable to that path, as explained on the Augustus page.

The following example copies the aspergillus_nidulans dataset. First, create a local directory:

mkdir -p augustus/species

Load the busco module and copy augustus data

cp -r $AUGUSTUS_CONFIG_PATH/species/aspergillus_nidulans augustus/species/

Copy the models

cp -r $AUGUSTUS_CONFIG_PATH/model augustus/

Add this to the busco job script and submit it.
export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus
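The copy steps above can be collected into one function. This is a sketch under stated assumptions: copy_augustus_data is a hypothetical helper name, and it copies only the species directory and the model directory named in the steps above. It takes the source path as an argument so that it can be called before AUGUSTUS_CONFIG_PATH is redirected to the local copy.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO or Augustus): create a local,
# writable copy of one Augustus species plus the model directory.
#   $1  source Augustus config directory (the original $AUGUSTUS_CONFIG_PATH)
#   $2  destination directory (for example ./augustus)
#   $3  species name (for example aspergillus_nidulans)
copy_augustus_data() {
    local src="$1" dest="$2" species="$3"
    if [ ! -d "${src}/species/${species}" ]; then
        echo "species not found: ${src}/species/${species}" >&2
        return 1
    fi
    mkdir -p "${dest}/species"
    # -r is required because both sources are directories.
    cp -r "${src}/species/${species}" "${dest}/species/"
    if [ -d "${src}/model" ]; then
        cp -r "${src}/model" "${dest}/"
    fi
}

# Usage in a job script, after loading the busco module:
#   copy_augustus_data "$AUGUSTUS_CONFIG_PATH" "$(pwd)/augustus" aspergillus_nidulans
#   export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus
```

The export comes last so that the copy reads from the installed data and BUSCO then writes to the local copy.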

Citation

If you publish research that uses busco you have to cite it as follows:

BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. Bioinformatics, published online June 9, 2015.

@@ Line 18: / Line 18: @@
 {{App_Description|app={{#var:app}}|url={{#var:url}}|name={{#var:app}}}}|}}
-Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs
+BUSCO stands for Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs
 <!--Modules-->
-==Required Modules==
+==Environment Modules==
+Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
-===Serial===
-* {{#var:app}}
-<!--
-===Parallel (OpenMP)===
-* intel
-* {{#var:app}}
-===Parallel (MPI)===
-* intel
-* openmpi
-* {{#var:app}}
--->
 ==System Variables==
-* HPC_{{#uppercase:{{#var:app}}}}_DIR - installation directory
+* HPC_{{uc:{{#var:app}}}}_DIR - installation directory
 <!--Configuration-->
 {{#if: {{#var: conf}}|==Configuration==
@@ Line 45: / Line 35: @@
 Busco uses a config file which needs to be copied and modified to your needs.
-  $ cp $HPC_BUSCO_CONF/config.ini /home/username/busco
+  $ cp $HPC_BUSCO_CONF/config.ini .
-  $ export BUSCO_CONFIG_FILE=/home/username/busco
+  $ export BUSCO_CONFIG_FILE=$(pwd)/config.ini
-  $ run_BUSCO.py
+  $ busco -f -i ... <other arguments>
+If you don't need to modify the config file you can use the installed copy:
+ $ busco -f --config ${HPC_BUSCO_CONF}/config.ini -i ... <other arguments>
-Datasets are located in /ufrc/data/reference/busco/
+[https://busco.ezlab.org/busco_userguide.html#mandatory-arguments Mandatory arguments]
+*-i or --in defines the input file to analyse which is either a nucleotide fasta file or a protein fasta file, depending on the BUSCO mode. As of v5.1.0 the input argument can now also be a directory containing fasta files to run in batch mode.
+*-o or --out defines the folder that will contain all results, logs, and intermediate data
+*-m or --mode sets the assessment MODE: genome, proteins, transcriptome
+*-l or --lineage_dataset
-Available datasets:
+Datasets are located in /data/reference/busco/VERSION. The config.ini file is already configured to use the correct path.
+'''Available datasets:'''
+<div style="column-count:3">
 *arthropoda
 *bacteria
@@ Line 60: / Line 59: @@
 *metazoa
 *vertebrata
+</div>
+and many more. Run the following command to see all available species:
+ $ ls /data/reference/busco/VERSION
+ #for example: $ ls /data/reference/busco/v5
 Example of busco run with metazoa dataset:
   busco -f -i target.fa -o SAMPLE -l ${HPC_BUSCO_DAT}/metazoa -m genome
-To allow busco to retrain the augustus dataset create a local augustus directory, set $AUGUSTUS_CONFIG_PATH variable to that path, and copy the dataset for the organism in question to your local directory.
+To allow busco to retrain an existing Augustus dataset create a local copy of the data and set $AUGUSTUS_CONFIG_PATH variable to that path as explained on the [[Augustus]] page.
-as explained on the [[Augustus]] page.
+<div class="mw-collapsible mw-collapsed" style="width:70%; padding: 5px; border: 1px solid gray;">
+''Expand this section to view an example, copying aspergillus nidulans.''
+<div class="mw-collapsible-content" style="padding: 5px;">
+Let's copy aspergillus_nidulans
+mkdir -p augustus/species
+# Load the busco module and copy augustus data
+#*<pre>cp -r $AUGUSTUS_CONFIG_PATH/species/aspergillus_nidulans augustus/species/</pre>
+# Copy the models
+#*<pre>cp -r $AUGUSTUS_CONFIG_PATH/model augustus/</pre>
+#Add this to the busco job script and submit it.
+#*<pre>export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus</pre></div></div>
 |}}
 <!--PBS scripts-->