Impg

From UFRC

Description

impg website  


impg (Implicit Pangenome Graph) projects sequence ranges through many-way (e.g. all-vs-all) pairwise alignments built by tools like wfmash and minimap2.


At its core, impg lifts over ranges from a target sequence into the other genomes described in the alignments. In effect, it collects the homologous loci from every genome mapped onto a chosen target region. This is particularly useful for comparing a specific genomic region across individuals, strains, or species in a pangenomic or comparative genomic setting. The output is in BED format, which makes it straightforward to extract the corresponding FASTA sequences for downstream multiple sequence alignment (e.g., mafft) or pangenome graph building (e.g., pggb or minigraph-cactus).
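As a rough illustration of the BED-to-FASTA step, the following sketch slices in-memory sequences using BED intervals (0-based, half-open). It is not part of impg; the function names, sequence names, and sequences are hypothetical, and in practice a dedicated tool would normally do this against indexed FASTA files.

```python
def parse_bed(lines):
    """Yield (name, start, end) from BED lines (0-based, half-open)."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2])


def extract_regions(sequences, bed_lines):
    """Return (header, subsequence) pairs for each BED interval."""
    records = []
    for name, start, end in parse_bed(bed_lines):
        seq = sequences[name]
        records.append((f"{name}:{start}-{end}", seq[start:end]))
    return records


def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text."""
    out = []
    for header, seq in records:
        out.append(f">{header}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Hypothetical toy data standing in for real genomes and a BED file of projected ranges.
sequences = {
    "sampleA#1#chr1": "ACGTACGTAC",
    "sampleB#1#chr1": "TTGACCGTAA",
}
bed_lines = [
    "sampleA#1#chr1\t2\t6",
    "sampleB#1#chr1\t0\t4",
]

records = extract_regions(sequences, bed_lines)
print(to_fasta(records), end="")
```

The resulting FASTA text can then be passed to an aligner or graph builder of your choice.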


impg uses coitrees (implicit interval trees) to provide efficient range lookup over the input alignments. CIGAR strings are converted to a compact delta encoding. This approach allows for fast and memory-efficient projection of sequence ranges through alignments.
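To make the idea of projecting a range through an alignment concrete, here is a minimal sketch that walks a CIGAR string linearly and maps a target range to query coordinates. It is not impg's implementation: it uses neither interval trees nor delta encoding, handles only forward-strand alignments, and assumes the common convention that I consumes only the query and D consumes only the target. The function name and the example coordinates are hypothetical.

```python
import re


def project_range(target_start, query_start, cigar, range_start, range_end):
    """Map the target range [range_start, range_end) to query coordinates.

    Returns a half-open (query_start, query_end) tuple spanning the first to
    the last aligned base inside the range, or None if no aligned base
    overlaps the range. Forward strand only.
    """
    t, q = target_start, query_start
    q_min = q_max = None
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=X":
            # Aligned block: consumes both target and query.
            ov_start = max(t, range_start)
            ov_end = min(t + length, range_end)
            if ov_start < ov_end:
                if q_min is None:
                    q_min = q + (ov_start - t)
                q_max = q + (ov_end - t)
            t += length
            q += length
        elif op == "D":
            # Deletion: consumes target only.
            t += length
        elif op == "I":
            # Insertion: consumes query only.
            q += length
        else:
            raise ValueError(f"unsupported CIGAR operation: {op}")
    return None if q_min is None else (q_min, q_max)


# Hypothetical alignment: target starts at 100, query starts at 200.
cigar = "10=2D5=3I10="
projected = project_range(100, 200, cigar, 105, 120)
print(projected)

# A range that falls entirely inside the 2-base deletion has no aligned bases.
in_deletion = project_range(100, 200, cigar, 110, 112)
print(in_deletion)
```

A linear walk like this costs time proportional to the CIGAR length for every query, which is the cost an interval-tree index over the alignments is meant to avoid.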


Environment Modules

Run module spider impg to find out what environment modules are available for this application.

System Variables

  • HPC_IMPG_DIR - installation directory
  • HPC_IMPG_BIN - executable directory