HapTree

Description

haptree website  

HapTree is a polyploid haplotype assembly tool based on a statistical framework.

Required Modules

Serial

  • haptree

System Variables

  • HPC_HAPTREE_DIR - installation directory
  • HPC_HAPTREE_DOC - documentation directory
  • HPC_HAPTREE_EXE - examples directory

Additional Information

HapTree README

HapTree v1.0 is intended for diploid data and arbitrary ReadGraphs. Polyploid data is not supported in this version, but will be available in the next version. To run simple polyploid examples, see the previous version.

HapTree takes three arguments: READS, VCF, and OUTPUTNAME. To run HapTree, type:

  HapTree READS VCF OUTPUTNAME

The arguments are positional and unlabeled, so their order matters.

The format of READS is a fragment file; we use the same format as HapCut. At this time we recommend using HapCut's extracthairs to translate from BAM to fragment file. See the example files for more details. The format of VCF is the standard VCF format, and it must be sorted. The indices of the SNPs in READS _MUST_ match those in the VCF file.
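Because the VCF must be sorted, it can help to check this before starting a run. The sketch below is one way to do that in Python. It assumes "sorted" means that records are in nondecreasing position order within each contig and that each contig's records are contiguous; the README does not spell this out, so HapTree's actual requirement may differ. The function name and the contig name `chr1` are hypothetical.

```python
def vcf_is_sorted(lines):
    """Return True if VCF data records are position-sorted within each contig.

    Assumption: each contig's records are contiguous and POS never decreases
    inside a contig. This is an illustrative check, not part of HapTree.
    """
    seen_contigs = set()
    current_contig = None
    last_pos = None
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        fields = line.split(None, 2)  # CHROM and POS are the first two columns
        contig, pos = fields[0], int(fields[1])
        if contig != current_contig:
            if contig in seen_contigs:
                return False  # contig reappears after a different contig
            seen_contigs.add(contig)
            current_contig = contig
        elif pos < last_pos:
            return False  # position went backwards inside the same contig
        last_pos = pos
    return True


# Hypothetical records; only the positions are taken from the example output.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t4275998\t.\tA\tG\t.\t.\t.",
    "chr1\t4276078\t.\tC\tT\t.\t.\t.",
]
print(vcf_is_sorted(example))
```

For a real file, pass an open file object, for example `vcf_is_sorted(open(path))`. This check says nothing about whether the SNP indices in READS match the VCF; that still has to be ensured when the fragment file is generated.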

The third argument is whatever you wish to name your output folder.
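Since the three arguments are purely positional, a small wrapper can make the order harder to get wrong when HapTree is called from a script. The following is a minimal sketch, assuming the `HapTree` executable is on the PATH (for example after loading the haptree module). The function names and file names are hypothetical, and whether HapTree reports failure through its exit status is an assumption.

```python
import subprocess


def build_haptree_command(reads, vcf, output_name, executable="HapTree"):
    """Return the argument list in the positional order the README gives."""
    return [executable, reads, vcf, output_name]


def run_haptree(reads, vcf, output_name):
    """Run HapTree; raises CalledProcessError if it exits with a non-zero status.

    Assumption: HapTree signals failure via its exit status.
    """
    subprocess.run(build_haptree_command(reads, vcf, output_name), check=True)


# Hypothetical file and folder names, for illustration only.
print(build_haptree_command("sample.fragments", "sample.sorted.vcf", "haptree_out"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.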

To count the switch errors of a particular SOLUTION against a partially phased VCF file (phased_VCF), run:

  scoring SOLUTION phased_VCF

The solution does not have to come from HapTree; it only has to match the HapTree solution format. That format is similar to HapCut's; an explanation and an example follow.
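For intuition about what is being counted, the sketch below uses a common definition of a switch error for diploid data: walking along the SNPs of one block, a switch is counted each time the predicted haplotype flips between agreeing and disagreeing with the true haplotype. This is an illustration only; it is not the `scoring` program, which may handle blocks and unphased sites differently. The function name is hypothetical.

```python
def count_switch_errors(predicted, truth):
    """Count switch errors between two haplotypes over the same ordered SNPs.

    Each argument is a list of alleles (0 or 1) for one haplotype of a single
    block. Sites where either value is None (unphased) are skipped.
    """
    switches = 0
    previous_match = None
    for p, t in zip(predicted, truth):
        if p is None or t is None:
            continue  # no phase information at this site
        match = (p == t)
        if previous_match is not None and match != previous_match:
            switches += 1  # agreement flipped relative to the previous site
        previous_match = match
    return switches


print(count_switch_errors([0, 0, 1, 1], [0, 0, 0, 0]))  # one flip after the second SNP
```

Note that a prediction that is the exact complement of the truth has no switch errors under this definition, because in a diploid the two haplotype labels are interchangeable.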

Each phased block is separated by ***** and labeled with the SNP starting the block, the length of the block in SNPs, the number of SNPs phased, the distance in the genome between the first and last SNPs phased, the MEC score of the block, and the number of reads covering the block. We have no phasing information between separate blocks because we have no read data to support an opinion. In the solution, a 0 represents the reference allele and a 1 the alternative allele.

Two phased blocks are below:

BLOCK Start: 1514 Len: 2 Phased: 2 Span: 80 MEC: 0 Reads: 19 
1514    0       1       4275998 
1515    1       0       4276078 
*****
BLOCK Start: 1518 Len: 4 Phased: 4 Span: 178 MEC: 0 Reads: 114 
1518    0       1       4279142 
1519    0       1       4279211 
1520    0       1       4279319 
1521    0       1       4279320 
***** 
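The block format above is straightforward to read programmatically. The sketch below parses it in Python, assuming the four columns of each row are the SNP index, the allele on the first haplotype, the allele on the second haplotype, and the genomic position. That column reading is inferred from the example (the Span value equals the difference between the last and first positions) and is not stated in the README. Alleles are kept as strings in case a file uses a non-numeric placeholder for unphased sites. The function name is hypothetical.

```python
import re

HEADER = re.compile(
    r"BLOCK Start: (\d+) Len: (\d+) Phased: (\d+) Span: (\d+) MEC: (\S+) Reads: (\d+)"
)
KEYS = ("start", "len", "phased", "span", "mec", "reads")


def parse_solution(text):
    """Parse HapTree-style solution text into a list of block dictionaries."""
    blocks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # blank line or ***** block separator
        header = HEADER.match(line)
        if header:
            block = {
                key: (float(value) if key == "mec" else int(value))
                for key, value in zip(KEYS, header.groups())
            }
            block["rows"] = []
            blocks.append(block)
        elif blocks:
            # Assumed columns: SNP index, allele A, allele B, genomic position.
            snp, allele_a, allele_b, position = line.split()[:4]
            blocks[-1]["rows"].append((int(snp), allele_a, allele_b, int(position)))
    return blocks


EXAMPLE = """\
BLOCK Start: 1514 Len: 2 Phased: 2 Span: 80 MEC: 0 Reads: 19
1514    0       1       4275998
1515    1       0       4276078
*****
BLOCK Start: 1518 Len: 4 Phased: 4 Span: 178 MEC: 0 Reads: 114
1518    0       1       4279142
1519    0       1       4279211
1520    0       1       4279319
1521    0       1       4279320
*****
"""

for block in parse_solution(EXAMPLE):
    first, last = block["rows"][0], block["rows"][-1]
    print(block["start"], block["span"], last[3] - first[3])
```

Each printed line shows the block's starting SNP, its reported span, and the span recomputed from the row positions, which gives a quick consistency check on a solution file.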

Validation

  • Validated 4/5/2018