Purge_haplotigs identifies pairs of syntenic contigs and moves one contig of each pair to the haplotig 'pool'. The pipeline uses mapped read coverage and Minimap2 alignments to determine which contigs to keep for the haploid assembly. Dotplots, juxtaposed with read coverage, are optionally produced for all flagged contig matches to help the user assign any remaining ambiguous contigs. The pipeline runs on either a haploid assembly (e.g. Canu, FALCON or FALCON-Unzip primary contigs) or a phased-diploid assembly (e.g. FALCON-Unzip primary contigs + haplotigs).
Run module spider purge_haplotigs to find out which environment modules are available for this application.
- HPC_PURGE_HAPLOTIGS_DIR - installation directory
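As a minimal sketch of how the variable above might be used, the shell snippet below loads the module when a module command is available and then prints the installation directory. The helper name show_purge_haplotigs_dir is hypothetical and introduced only for illustration. The module name is assumed to match the one used with module spider, and your cluster may require a specific version.

```shell
#!/bin/bash
# Hypothetical helper (not part of purge_haplotigs): print the installation
# directory, or report on stderr that the variable is not set.
show_purge_haplotigs_dir() {
    if [ -n "${HPC_PURGE_HAPLOTIGS_DIR:-}" ]; then
        echo "$HPC_PURGE_HAPLOTIGS_DIR"
    else
        echo "HPC_PURGE_HAPLOTIGS_DIR is not set; load the module first" >&2
        return 1
    fi
}

# Load the module only if the 'module' command exists in this shell.
# Assumption: the module is named purge_haplotigs, as in 'module spider'.
if command -v module >/dev/null 2>&1; then
    module load purge_haplotigs
fi

# Print the directory; do not abort the script if the variable is missing.
show_purge_haplotigs_dir || true
```

If the variable is set, its value can be used to locate files shipped with the installation, for example by listing the directory with ls "$HPC_PURGE_HAPLOTIGS_DIR".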
If you publish research that uses purge_haplotigs, please cite it as follows:
The pipeline is published in BMC Bioinformatics:
Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies