Difference between revisions of "SRA"
Moskalenko (talk | contribs) |
Revision as of 19:03, 30 November 2016
Description
This is the NCBI Sequence Read Archive (SRA) Toolkit.
Note: The SRA toolkit creates a $HOME/ncbi/public directory and caches prefetched data files there. However, the home directory has a 20 GB limit, and using it to store job data violates the UFRC storage policy (https://www.rc.ufl.edu/about/policies/storage/). You must change the cache location to a directory in your UFRC space before running the SRA toolkit. The official approach is to run the vdb-config tool interactively:
vdb-config -i
and change the directory to, for example, /ufrc/$GROUP/$USER/ncbi/public.
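Before opening vdb-config, it can help to work out the target path and confirm that it is not under the home directory. The following is a minimal sketch, not an official procedure: the helper names sra_cache_dir and outside_home are hypothetical, the fallback group name mygroup is a placeholder, and it assumes GROUP holds your group name, which is not set by default in every shell.

```shell
#!/bin/bash
# Sketch: build the cache path in group storage and confirm it is outside $HOME.
# Helper names and the "mygroup" fallback are illustrative assumptions.

# Print the proposed cache directory for a storage root, group, and user.
sra_cache_dir() {
    printf '%s/%s/%s/ncbi/public\n' "$1" "$2" "$3"
}

# Succeed only if the given path is NOT the home directory or inside it.
outside_home() {
    case "$1" in
        "$HOME"|"$HOME"/*) return 1 ;;
        *) return 0 ;;
    esac
}

cache_dir=$(sra_cache_dir /ufrc "${GROUP:-mygroup}" "${USER:-$(id -un)}")

if outside_home "$cache_dir"; then
    echo "Path to enter in vdb-config -i: $cache_dir"
    echo "Create it first with: mkdir -p $cache_dir"
else
    echo "Refusing: $cache_dir is inside \$HOME" >&2
fi
```

The script only prints the path and a suggested mkdir command; the actual change of cache location is still made in the vdb-config -i interface described in the note.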
Required Modules
Serial
- sra
System Variables
- HPC_SRA_DIR - installation directory
- HPC_SRA_BIN - location of the executables directory
- HPC_SRA_DOC - location of the documentation directory
Additional Information
Aspera Connect
To download SRA data you can use the "ascp" utility from the Aspera Connect browser plugin package. A copy is installed and provided by the sra module, together with a wrapper script, ascp.sh, that automatically uses the SSH key. For instance:
ascp.sh -QT anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz faa
will download the all.faa.tar.gz archive to the faa directory.
Note: if the download fails to start on the first try with a "Session Stop (Error: Client unable to connect to server (check UDP port and firewall))" error, just re-run the command. The cause is a DNS (host name resolution) problem that resolves itself.
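Since the remedy for this error is simply to run the command again, the re-run can be automated in a job script. The sketch below is one possible approach, not part of the sra module: the retry function and the RETRY_DELAY variable are hypothetical names introduced here, and the attempt count of 3 is an arbitrary choice.

```shell
#!/bin/bash
# Sketch: re-run a command until it succeeds or the attempt limit is reached.
# retry and RETRY_DELAY are illustrative names, not provided by the sra module.

retry() {
    local max=$1
    shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "Giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # Pause between attempts; defaults to 5 seconds.
        sleep "${RETRY_DELAY:-5}"
    done
}

# Run the download from the example above only if the wrapper is on PATH
# (that is, after the sra module has been loaded).
if command -v ascp.sh >/dev/null 2>&1; then
    retry 3 ascp.sh -QT anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz faa
fi
```

Note that the loop retries on any non-zero exit status, not only on the "Session Stop" error, so a persistent failure such as a wrong path will also be retried before the function gives up.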