The NCBI Prokaryotic Genome Annotation Pipeline (PGAP) is designed to annotate bacterial and archaeal genomes (chromosomes and plasmids). Genome annotation is a multi-level process that includes prediction of protein-coding genes as well as other functional genome units such as structural RNAs, tRNAs, small RNAs, and pseudogenes. NCBI has developed an automatic prokaryotic genome annotation pipeline that combines ab initio gene prediction algorithms with homology-based methods. The first version of the pipeline was developed in 2001, and it has been regularly upgraded since then to improve structural and functional annotation quality (Li W, O'Neill KR, et al. 2021). Recent improvements include the use of curated protein profile hidden Markov models (HMMs) and curated complex domain architectures for functional annotation of proteins, as well as annotation of Enzyme Commission numbers and Gene Ontology terms.
Run "module spider pgap" to find out what environment modules are available for this application.
- HPC_PGAP_DIR - installation directory
- HPC_PGAP_BIN - executable directory
The PGAP module provides a wrapper function ("pgap.py") to fine-tune usage on HPG. This means that you should run "pgap.py ..." without a path when running the command. See the Job Script Examples section below for an example.
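The following is a minimal sketch of how one might confirm that the module is loaded before running anything. It assumes a bash shell and that the module is loaded with "module load pgap"; the helper name "describe_pgap_env" is hypothetical and is not part of the PGAP module. It only reports the two variables listed above and checks whether "pgap.py" can be found without a path.

```shell
#!/bin/bash
# Sketch only. Assumes bash and that "module load pgap" is the correct load command.

# Load the module if the "module" command is available in this shell.
if command -v module >/dev/null 2>&1; then
    module load pgap
fi

# Hypothetical helper: print the variables set by the PGAP module,
# or "unset" when the module has not been loaded.
describe_pgap_env() {
    echo "HPC_PGAP_DIR=${HPC_PGAP_DIR:-unset}"
    echo "HPC_PGAP_BIN=${HPC_PGAP_BIN:-unset}"
}

describe_pgap_env

# Check that the wrapper can be called without a path.
if command -v pgap.py >/dev/null 2>&1; then
    echo "pgap.py is available"
else
    echo "pgap.py not found; load the pgap module first"
fi
```

If either variable prints as "unset", the module was not loaded in the current shell, and "pgap.py" will likely not be found either.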
Job Script Examples
See the PGAP_Job_Scripts page for PGAP Job script examples.
Please cite NCBI in any work or product based on this material.