Difference between revisions of "GATK"

From UFRC
Line 31:
  <!--Additional-->
  {{#if: {{#var: exe}}|==Additional Information==
- We provide two wrapper scripts AnalyzeCovariates and GenomeAnalysisTK that are equivalent to running
- java -jar $HPC_GATK_DIR/GenomeAnalysisTK.jar
- and
- java -jar $HPC_GATK_DIR/AnalyzeCovariates.jar
+ We provide a wrapper script GenomeAnalysisTK that is equivalent to running
+     mkdir -p tmp
+     export TMPDIR=$(pwd)/tmp
+     java -Djava.io.tmpdir=$TMPDIR -cp /apps/gatk/jexl/2.1.1/commons-jexl-2.1.1.jar -jar $HPC_GATK_DIR/GenomeAnalysisTK.jar
+
+ If you do not use the wrapper you '''must''' make sure to create and use a local ''TMPDIR'' in your /ufrc space with GenomeAnalysisTK.jar. Otherwise /tmp will be used by default leading to filled up /tmp partitions on compute nodes and node failure.
  |}}
  {{#if: {{#var: conf}}|==Configuration==

Revision as of 17:31, 4 January 2017

Description

gatk website  

The GATK is, first, a structured software library that makes it very easy to write efficient analysis tools for next-generation sequencing data. Second, it is a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller, and a local realigner.

We aim to work well with both samtools and Picard by providing tools that complement those available in the two packages. Our SNP calling pipeline (Q score recalibration -> multiple sequence realignment -> SNP/indel calling) is a particular area of focus, and we have been pushing to make these capabilities as general-purpose and powerful as possible. My group's mandate is to ensure the success of the human medical resequencing projects we've undertaken at the Broad over the next 2-3 years. This involves providing a robust, production-quality development library that underlies tools for common analysis problems (like SNP calling) and that enables exploratory research on NGS data.

Upstream documentation for gatk.

Required Modules

modules documentation

Serial

  • gatk

System Variables

  • HPC_GATK_DIR - installation directory

Additional Information

We provide a wrapper script, GenomeAnalysisTK, that is equivalent to running:

   mkdir -p tmp
   export TMPDIR=$(pwd)/tmp
   java -Djava.io.tmpdir=$TMPDIR -cp /apps/gatk/jexl/2.1.1/commons-jexl-2.1.1.jar -jar $HPC_GATK_DIR/GenomeAnalysisTK.jar

If you do not use the wrapper, you must create a local TMPDIR in your /ufrc space and pass it to GenomeAnalysisTK.jar through java.io.tmpdir. Otherwise /tmp is used by default, which fills up the /tmp partitions on compute nodes and causes node failures.
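
For anyone calling the jar directly, the steps listed for the wrapper can be collected into a small script. This is only a sketch, not the actual wrapper: it assumes the gatk module is already loaded (so that HPC_GATK_DIR is set) and that the script is started from a directory in your /ufrc space. The guard around the java call and the message it prints are additions for illustration, and the java options are copied from the command shown above.

```shell
#!/bin/sh
# Sketch only: mirrors the wrapper steps described above, not the real wrapper.
# Assumption: run from a directory in your /ufrc space with the gatk module loaded.

# Create a job-local temporary directory and point TMPDIR at it.
mkdir -p tmp
export TMPDIR="$(pwd)/tmp"

# Run GATK only if java and the jar can be found (illustrative guard, not part
# of the wrapper description). "$@" forwards any GATK arguments unchanged.
if command -v java >/dev/null 2>&1 && [ -f "$HPC_GATK_DIR/GenomeAnalysisTK.jar" ]; then
    java -Djava.io.tmpdir="$TMPDIR" \
         -cp /apps/gatk/jexl/2.1.1/commons-jexl-2.1.1.jar \
         -jar "$HPC_GATK_DIR/GenomeAnalysisTK.jar" "$@"
else
    echo "java or GenomeAnalysisTK.jar not found; skipping GATK run" >&2
fi
```

Because TMPDIR is derived from the current working directory, starting the script from a job directory in /ufrc keeps GATK's temporary files off the compute node's /tmp partition.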