Difference between revisions of "BUSCO"
Revision as of 22:26, 16 April 2020
Description
Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs
Required Modules
Serial
- busco
System Variables
- HPC_BUSCO_DIR - installation directory
Additional Information
BUSCO uses a config file, which you need to copy and modify to suit your needs.
$ cp $HPC_BUSCO_CONF/config.ini /home/username/config.ini
$ export BUSCO_CONFIG_FILE=/home/username/config.ini
$ run_BUSCO.py
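The copy-and-export steps above could be wrapped in a small shell function for use in a job script. This is only a sketch: the function name setup_busco_config and its two arguments are hypothetical and not part of BUSCO. It assumes the source directory contains a file named config.ini, as in the commands above, and it leaves an existing personal copy untouched so that earlier edits are not overwritten.

```shell
# Sketch only: setup_busco_config is a hypothetical helper, not part of BUSCO.
# Usage: setup_busco_config <directory containing config.ini> <destination file>
setup_busco_config() {
    conf_src="$1/config.ini"
    conf_dest="$2"

    # Copy only when no personal copy exists yet, so earlier edits are kept.
    if [ ! -f "$conf_dest" ]; then
        if [ ! -f "$conf_src" ]; then
            echo "config.ini not found in $1" >&2
            return 1
        fi
        cp "$conf_src" "$conf_dest" || return 1
    fi

    # Point BUSCO at the personal copy.
    export BUSCO_CONFIG_FILE="$conf_dest"
}
```

In a job script this might be called as setup_busco_config "$HPC_BUSCO_CONF" "$HOME/config.ini" after the busco module has been loaded.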
Datasets are located in /ufrc/data/reference/busco/
Available datasets:
- arthropoda
- bacteria
- eukaryota
- fungi
- metazoa
- vertebrata
and many more. Run the following command to see all available datasets:
$ ls /ufrc/data/reference/busco/
Example of busco run with metazoa dataset:
busco -f -i target.fa -o SAMPLE -l ${HPC_BUSCO_DAT}/metazoa -m genome
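Before submitting a long job, it can help to confirm that the chosen dataset directory actually exists. The following is a minimal sketch of such a check; busco_lineage_dir is a hypothetical helper name, and the only assumption is that each dataset is a subdirectory of the dataset location, as the listing above suggests.

```shell
# Sketch only: busco_lineage_dir is a hypothetical helper, not part of BUSCO.
# Usage: busco_lineage_dir <dataset base directory> <dataset name>
# Prints the dataset path if it exists; otherwise reports an error and fails.
busco_lineage_dir() {
    base="$1"
    name="$2"

    if [ -d "$base/$name" ]; then
        printf '%s\n' "$base/$name"
    else
        echo "dataset '$name' not found under $base" >&2
        return 1
    fi
}
```

For example, lineage=$(busco_lineage_dir /ufrc/data/reference/busco metazoa) || exit 1 stops the script early if the dataset is missing; the printed path can then be passed to the -l option.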
To allow busco to retrain an existing Augustus dataset, create a local copy of the data and set the $AUGUSTUS_CONFIG_PATH variable to that path, as explained on the Augustus page.
- Example
Let's copy aspergillus_nidulans
mkdir -p augustus/species
- Load the busco module and copy the Augustus species data
cp -r $AUGUSTUS_CONFIG_PATH/species/aspergillus_nidulans/ augustus/species/
- Copy the models
cp -r $AUGUSTUS_CONFIG_PATH/model augustus/
Add
export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus
to the busco job script and submit it.
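The steps above can be collected into one function, sketched here under a few assumptions: make_local_augustus is a hypothetical name, the source directory contains species/ and model/ subdirectories as in the example, and recursive copies (cp -r) are used because both are directories.

```shell
# Sketch only: make_local_augustus is a hypothetical helper, not part of BUSCO
# or Augustus.
# Usage: make_local_augustus <source config dir> <local copy dir> <species name>
make_local_augustus() {
    src="$1"       # existing Augustus config directory
    dest="$2"      # local, writable copy to create
    species="$3"   # species directory to copy, e.g. aspergillus_nidulans

    if [ ! -d "$src/species/$species" ]; then
        echo "species '$species' not found in $src/species" >&2
        return 1
    fi

    mkdir -p "$dest/species" || return 1

    # Both the species data and the models are directories, so copy recursively.
    cp -r "$src/species/$species" "$dest/species/" || return 1
    cp -r "$src/model" "$dest/" || return 1

    # Export the absolute path of the local copy for the rest of the job script.
    AUGUSTUS_CONFIG_PATH="$(cd "$dest" && pwd)"
    export AUGUSTUS_CONFIG_PATH
}
```

In a job script, after loading the busco module, this might be called as make_local_augustus "$AUGUSTUS_CONFIG_PATH" augustus aspergillus_nidulans before the busco command.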
Citation
If you publish research that uses busco you have to cite it as follows:
Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics, published online June 9, 2015.