The GDC provides a standard client-based mechanism to support high-performance data downloads and submissions. Raw sequence files, typically stored as BAM or FASTQ, make up the bulk of the data. The size of a single file varies greatly depending on the specific analysis; however, some of the whole-genome BAM files in The Cancer Genome Atlas (TCGA) reach 200-300 GB. In such cases, a high-performance data download and submission client is essential.
- HPC_GDC-CLIENT_DIR - installation directory