BUSCO
Latest revision as of 19:38, 27 March 2023
Description
BUSCO (Benchmarking Universal Single-Copy Orthologs) assesses the completeness of genome assemblies and annotations.
Environment Modules
Run module spider busco to find out what environment modules are available for this application.
System Variables
- HPC_BUSCO_DIR - installation directory
Additional Information
BUSCO uses a config file, which you copy and then modify to suit your needs:
$ cp $HPC_BUSCO_CONF/config.ini .
$ export BUSCO_CONFIG_FILE=$(pwd)/config.ini
$ busco -f -i ... <other arguments>
If you don't need to modify the config file you can use the installed copy:
$ busco -f --config ${HPC_BUSCO_CONF}/config.ini -i ... <other arguments>
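The choice between a local copy and the installed copy can be wrapped in a small shell function. This is only a sketch: the function name use_busco_config is hypothetical, and it assumes that BUSCO picks up the BUSCO_CONFIG_FILE variable for the installed copy in the same way as for a local copy.

```shell
# Hypothetical helper (not part of BUSCO or the busco module).
#   use_busco_config copy -> copy the installed config.ini into the current
#                            directory and point BUSCO_CONFIG_FILE at the copy
#   use_busco_config      -> point BUSCO_CONFIG_FILE at the installed copy
use_busco_config() {
    if [ -z "${HPC_BUSCO_CONF:-}" ]; then
        echo "HPC_BUSCO_CONF is not set; load the busco module first" >&2
        return 1
    fi
    _busco_installed="${HPC_BUSCO_CONF}/config.ini"
    if [ ! -f "$_busco_installed" ]; then
        echo "config not found: $_busco_installed" >&2
        return 1
    fi
    if [ "${1:-}" = "copy" ]; then
        # Local, editable copy in the current working directory
        cp "$_busco_installed" . || return 1
        BUSCO_CONFIG_FILE="$(pwd)/config.ini"
    else
        # Unmodified installed copy
        BUSCO_CONFIG_FILE="$_busco_installed"
    fi
    export BUSCO_CONFIG_FILE
}
```

After calling use_busco_config copy in the job directory, edit ./config.ini as needed and then run busco as usual.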
Mandatory arguments (see https://busco.ezlab.org/busco_userguide.html#mandatory-arguments):
- -i or --in defines the input file to analyse, which is either a nucleotide fasta file or a protein fasta file, depending on the BUSCO mode. As of v5.1.0 the input argument can also be a directory containing fasta files to run in batch mode.
- -o or --out defines the folder that will contain all results, logs, and intermediate data
- -m or --mode sets the assessment MODE: genome, proteins, transcriptome
- -l or --lineage_dataset specifies the lineage dataset to use
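To show how the four arguments above fit together, the following sketch assembles a busco command line and prints it rather than running it, so it can be checked before it goes into a job script. The function name busco_command is hypothetical; only the flags listed above are used, and the mode check covers just the three modes named on this page.

```shell
# Hypothetical helper: validate the arguments, then print the busco command
# line instead of executing it.
busco_command() {
    if [ "$#" -ne 4 ]; then
        echo "usage: busco_command INPUT OUTDIR MODE LINEAGE" >&2
        return 2
    fi
    _input=$1
    _outdir=$2
    _mode=$3
    _lineage=$4
    # Only the modes listed on this page are accepted
    case "$_mode" in
        genome|proteins|transcriptome) ;;
        *) echo "unknown mode: $_mode" >&2; return 2 ;;
    esac
    printf 'busco -f -i %s -o %s -m %s -l %s\n' \
        "$_input" "$_outdir" "$_mode" "$_lineage"
}
```

For example, busco_command target.fa SAMPLE genome metazoa prints a command line that can be pasted into a job script once it looks right.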
Datasets are located in /data/reference/busco/VERSION. The config.ini file is already configured to use the correct path.
Available datasets:
- arthropoda
- bacteria
- eukaryota
- fungi
- metazoa
- vertebrata
and many more. Run the following command to see all available datasets:
$ ls /data/reference/busco/VERSION
for example:
$ ls /data/reference/busco/v5
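Checking that a dataset exists before submitting a job can save a failed run. The sketch below assumes only that the data directory holds one entry per dataset. The function names are hypothetical, and the name passed to has_busco_dataset must match the directory entry exactly, so check the ls output first, since entries may carry a version suffix.

```shell
# Hypothetical helpers for inspecting a BUSCO data directory.

# Print one dataset name per line, sorted.
list_busco_datasets() {
    _data_dir=$1
    if [ ! -d "$_data_dir" ]; then
        echo "no such directory: $_data_dir" >&2
        return 1
    fi
    ls -1 "$_data_dir" | sort
}

# Succeed only if the directory contains an entry with exactly this name.
has_busco_dataset() {
    list_busco_datasets "$1" | grep -qxF -- "$2"
}
```

A job script could then start with a guard such as has_busco_dataset /data/reference/busco/v5 NAME || exit 1, where NAME is the entry shown by ls.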
Example of a busco run with the metazoa dataset:
$ busco -f -i target.fa -o SAMPLE -l ${HPC_BUSCO_DAT}/metazoa -m genome
To allow busco to retrain an existing Augustus dataset, create a local copy of the data and set the $AUGUSTUS_CONFIG_PATH variable to that path, as explained on the Augustus page.
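For example, to make a local copy of the aspergillus_nidulans data, first create the target directory:
$ mkdir -p augustus/species
Load the busco module, then copy the Augustus species data and the models:
$ cp -r $AUGUSTUS_CONFIG_PATH/species/aspergillus_nidulans augustus/species/
$ cp -r $AUGUSTUS_CONFIG_PATH/model augustus/
Then add this line to the busco job script and submit it:
$ export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus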
Citation
If you publish research that uses busco, cite it as follows:
Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics, published online June 9, 2015.