BUSCO

From UFRC

Description

busco website  

BUSCO (Benchmarking Universal Single-Copy Orthologs) assesses genome assembly and annotation completeness.

Environment Modules

Run module spider busco to find out what environment modules are available for this application.

System Variables

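  • HPC_BUSCO_DIR - installation directory
  • HPC_BUSCO_CONF - directory containing config.ini (used in the commands below)
  • HPC_BUSCO_DAT - directory containing the lineage datasets (used in the example run below)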

Additional Information

BUSCO reads its settings from a config file. To customize it, copy the installed file to your working directory, edit the copy, and point BUSCO_CONFIG_FILE at it:

$ cp $HPC_BUSCO_CONF/config.ini .
$ export BUSCO_CONFIG_FILE=$(pwd)/config.ini
$ busco -f -i ... <other arguments>

If you don't need to modify the config file, you can use the installed copy directly:

$ busco -f --config ${HPC_BUSCO_CONF}/config.ini -i ... <other arguments>
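The copy-and-export steps above can be wrapped in a small helper so that a job script fails early when the config file is missing. This is only a sketch: the function name setup_busco_config is hypothetical, and it assumes the busco module has been loaded so that HPC_BUSCO_CONF is set (a directory can also be passed explicitly).

```shell
# Hypothetical helper: copy the site config.ini into the current directory
# (unless a local copy already exists) and point BUSCO at it.
# Usage: setup_busco_config [CONFIG_DIR]   (CONFIG_DIR defaults to $HPC_BUSCO_CONF)
setup_busco_config() {
    local conf_dir="${1:-${HPC_BUSCO_CONF:-}}"
    if [ -z "$conf_dir" ] || [ ! -f "$conf_dir/config.ini" ]; then
        echo "config.ini not found in '${conf_dir}'; is the busco module loaded?" >&2
        return 1
    fi
    # Keep an existing local copy so earlier edits are not overwritten.
    if [ ! -f config.ini ]; then
        cp "$conf_dir/config.ini" .
    fi
    export BUSCO_CONFIG_FILE="$(pwd)/config.ini"
}
```

In a job script, calling setup_busco_config before the busco command leaves BUSCO_CONFIG_FILE pointing at the local, editable copy.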

Mandatory arguments

  • -i or --in defines the input file to analyse, which is either a nucleotide FASTA file or a protein FASTA file, depending on the BUSCO mode. As of v5.1.0, the input can also be a directory of FASTA files, which are then processed in batch mode.
  • -o or --out defines the folder that will contain all results, logs, and intermediate data
  • -m or --mode sets the assessment mode: genome, proteins, or transcriptome
  • -l or --lineage_dataset specifies the lineage dataset to assess against (see the available datasets below)

Datasets are located in /data/reference/busco/VERSION. The config.ini file is already configured to use the correct path.

Available datasets:

  • arthropoda
  • bacteria
  • eukaryota
  • fungi
  • metazoa
  • vertebrata

and many more. Run the following command to see all available datasets:

$ ls /data/reference/busco/VERSION
#for example: $ ls /data/reference/busco/v5
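Before submitting a long job, it can help to confirm that the chosen lineage directory actually exists. The following is a minimal sketch, not a site-provided tool: check_lineage is a hypothetical name, and the default dataset root is taken from HPC_BUSCO_DAT on the assumption that the busco module sets it.

```shell
# Hypothetical helper: print the full path of a lineage dataset if its
# directory exists under the dataset root; otherwise list what is available.
# Usage: check_lineage LINEAGE [DATASET_ROOT]   (root defaults to $HPC_BUSCO_DAT)
check_lineage() {
    local lineage="$1" dataset_root="${2:-${HPC_BUSCO_DAT:-}}"
    if [ ! -d "$dataset_root" ]; then
        echo "dataset root '$dataset_root' is not a directory" >&2
        return 1
    fi
    if [ -d "$dataset_root/$lineage" ]; then
        echo "$dataset_root/$lineage"
        return 0
    fi
    echo "lineage '$lineage' not found under '$dataset_root'. Available:" >&2
    ls "$dataset_root" >&2
    return 1
}
```

The printed path can be passed directly to the -l argument.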

Example of a BUSCO run with the metazoa dataset:

$ busco -f -i target.fa -o SAMPLE -l ${HPC_BUSCO_DAT}/metazoa -m genome
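To make the role of each mandatory argument explicit, the sketch below assembles a command line from the four values and rejects an unknown mode. It prints the command instead of running it, so it can be inspected first. The function name build_busco_cmd is hypothetical, and only the four mandatory options described above are included.

```shell
# Hypothetical helper: print a busco command built from the mandatory arguments.
# Usage: build_busco_cmd INPUT OUTPUT MODE LINEAGE
build_busco_cmd() {
    local input="$1" out="$2" mode="$3" lineage="$4"
    # Only the three modes listed under "Mandatory arguments" are accepted.
    case "$mode" in
        genome|proteins|transcriptome) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf 'busco -i %s -o %s -m %s -l %s\n' "$input" "$out" "$mode" "$lineage"
}
```

For instance, build_busco_cmd target.fa SAMPLE genome metazoa prints a command equivalent to the example run, minus the -f option and the dataset path prefix.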

To allow BUSCO to retrain an existing Augustus dataset, create a local copy of the data and set the AUGUSTUS_CONFIG_PATH variable to that path, as explained on the Augustus page.

Example: copying the aspergillus_nidulans species data.

  1. Load the busco module and create the local directory
    • mkdir -p augustus/species
  2. Copy the Augustus species data
    • cp -r $AUGUSTUS_CONFIG_PATH/species/aspergillus_nidulans augustus/species/
  3. Copy the models
    • cp -r $AUGUSTUS_CONFIG_PATH/model augustus/
  4. Add this to the busco job script and submit it.
    • export AUGUSTUS_CONFIG_PATH=$(pwd)/augustus
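The copy steps above can be collected into one function for use in a job script. This is a sketch under stated assumptions: make_local_augustus is a hypothetical name, the source directory defaults to the AUGUSTUS_CONFIG_PATH set by the module, and only the species and model directories named above are copied. Both are directories, so cp needs the -r option.

```shell
# Hypothetical helper: copy one Augustus species and the shared model directory
# into ./augustus, then point AUGUSTUS_CONFIG_PATH at the writable copy.
# Usage: make_local_augustus SPECIES [SOURCE_DIR]
#        (SOURCE_DIR defaults to the current $AUGUSTUS_CONFIG_PATH)
make_local_augustus() {
    local species="$1" src="${2:-${AUGUSTUS_CONFIG_PATH:-}}"
    if [ ! -d "$src/species/$species" ]; then
        echo "species '$species' not found in '$src/species'" >&2
        return 1
    fi
    mkdir -p augustus/species
    cp -r "$src/species/$species" augustus/species/
    cp -r "$src/model" augustus/
    export AUGUSTUS_CONFIG_PATH="$(pwd)/augustus"
}
```

Call it once per working directory, before the busco command. After it runs, AUGUSTUS_CONFIG_PATH refers to the local copy, so a second call should pass the original source directory explicitly.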



Citation

If you publish research that uses BUSCO, cite it as follows:

BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Felipe A. Simão, Robert M. Waterhouse, Panagiotis Ioannidis, Evgenia V. Kriventseva, and Evgeny M. Zdobnov. Bioinformatics, published online June 9, 2015.