Difference between revisions of "SRA"


Revision as of 04:55, 21 March 2012

Description

This is the NCBI Sequence Read Archive (SRA) Toolkit. The archive was formerly called the Short Read Archive.

Release notes:

SRA Toolkit 2.1.7a adds new features to the sam-dump and vdb-dump tools.

sam-dump now supports slicing across multiple sequences and dumping cSRA files to FASTA and FASTQ formats. In addition, sam-dump has three new parameters:

-=|--hide-identical              Output '=' if base is identical to reference
--gzip                           Compress output using gzip
--bzip2                          Compress output using bzip2
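As a minimal sketch of how the new `--gzip` parameter might be used, the function below wraps a single sam-dump call. It assumes that sam-dump writes its dump to standard output; the function name `dump_sam_gz` and the file names are hypothetical and chosen only for illustration.

```shell
# Write a gzip-compressed SAM dump of one cSRA file.
# Assumption: sam-dump writes to standard output. Both paths are placeholders.
dump_sam_gz() {
    input=$1
    output=$2
    if ! command -v sam-dump >/dev/null 2>&1; then
        echo "sam-dump not found; load the sra module first" >&2
        return 1
    fi
    # --gzip compresses the dump, so the output name ends in .gz by convention.
    sam-dump --gzip "$input" > "$output"
}

# Example call (hypothetical file names):
#   dump_sam_gz sample.csra sample.sam.gz
```

Swapping `--gzip` for `--bzip2` would select bzip2 compression instead, per the parameter list above.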

vdb-dump has two new parameters:

-o|--column_enum_short           enumerates columns in short form
-b|--boolean                     defines how booleans are printed (1,T)

The functionality of two scripts, config-assistant.perl and reference-assistant.perl, has been combined into a single script, configuration-assistant.perl, which helps users download the correct references for a given cSRA file and configure the user environment for the SRA Toolkit.

Available versions

  • 2.1.7

Running the application using modules

To use sra with the environment modules system at HPC, the following commands are available:

Get module information for sra:

$ module spider sra

Load the default application module:

$ module load sra

The modulefile for this software adds the directory with executable files to the shell execution PATH and sets the following environment variables:

  • HPC_SRA_DIR - directory where sra is located
  • HPC_SRA_BIN - location of the executables directory
  • HPC_SRA_DOC - location of the documentation directory
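
To confirm that the module has been loaded, one option is to check that the three variables listed above are defined. The following is a small sketch rather than a site-provided tool; the function name `check_sra_env` is hypothetical.

```shell
# Report whether each variable set by the sra modulefile is defined.
# Returns 0 if all three are set, 1 otherwise.
check_sra_env() {
    missing=0
    for name in HPC_SRA_DIR HPC_SRA_BIN HPC_SRA_DOC; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Typically run after "module load sra".
check_sra_env || echo "Some variables are missing; try 'module load sra'."
```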

Aspera Connect

To download SRA data, you can use the ascp utility from the Aspera Connect browser plugin package. A copy is installed and provided by the sra module, along with a wrapper script, ascp.sh, that automatically supplies the SSH key. For instance:

ascp.sh -QT anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz faa

will download the all.faa.tar.gz archive to the faa directory.
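
The download command above can be wrapped in a small helper for repeated use. This is a sketch under stated assumptions: the function name `fetch_with_ascp` is hypothetical, and creating the destination directory beforehand is a precaution assumed here, not something this page says ascp.sh requires.

```shell
# Download one remote path into a local directory with the ascp.sh wrapper.
fetch_with_ascp() {
    remote=$1
    dest=$2
    if ! command -v ascp.sh >/dev/null 2>&1; then
        echo "ascp.sh not found; run 'module load sra' first" >&2
        return 1
    fi
    # Assumed precaution: make sure the destination directory exists.
    mkdir -p "$dest" || return 1
    # Same flags as the example above.
    ascp.sh -QT "$remote" "$dest"
}

# Example call, equivalent to the command shown above:
#   fetch_with_ascp anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz faa
```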