This is the NCBI Sequence Read Archive (SRA) Toolkit.
- The SRA Toolkit creates a $HOME/ncbi/public directory and caches prefetched data files there. However, the home directory has a 20 GB limit, and using it for job data storage violates the UFRC storage policy. You must change that location to a directory in your /ufrc space before running the SRA Toolkit. The official approach is to use the vdb-config tool to change the directory to, for example, /ufrc/$GROUP/$USER/ncbi/public. See the SRA Toolkit Configuration Documentation for more details.
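As a rough sketch of the vdb-config approach, the script below builds the example path and passes it to vdb-config non-interactively. The helper names ncbi_cache_path and set_ncbi_cache are hypothetical. The configuration node /repository/user/main/public/root is an assumption based on commonly used SRA Toolkit setups and may differ between toolkit versions, so check it against the SRA Toolkit Configuration Documentation. Running "vdb-config -i" opens the interactive configuration screen instead.

```shell
#!/bin/bash
# Hypothetical helper: build the cache path used as an example in the text.
ncbi_cache_path() {
    local group="$1" user="$2"
    echo "/ufrc/${group}/${user}/ncbi/public"
}

# Hypothetical helper: create the directory and point the toolkit at it.
# ASSUMPTION: the node name below matches your installed toolkit version.
set_ncbi_cache() {
    local dir
    dir="$(ncbi_cache_path "$1" "$2")"
    if ! command -v vdb-config >/dev/null 2>&1; then
        echo "vdb-config not found; load the sra module first" >&2
        return 1
    fi
    mkdir -p "$dir" && vdb-config -s "/repository/user/main/public/root=${dir}"
}

# Example call (mygroup is a placeholder group name):
# set_ncbi_cache mygroup "$USER"
```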
Alternatively, create an 'ncbi' directory in your /ufrc space and symlink it to ~/ncbi. E.g.
$ mkdir /ufrc/mygroup/$USER/ncbi
$ ln -s /ufrc/mygroup/$USER/ncbi ~/ncbi
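The two commands above fail if ~/ncbi already exists, for example after an earlier toolkit run. The following is a minimal sketch of a more defensive variant. The function name link_ncbi_cache is hypothetical and not part of the sra module. It repoints an existing symlink but refuses to touch a real directory, which may still hold cached data.

```shell
#!/bin/bash
# Hypothetical helper: make LINK_PATH (default ~/ncbi) a symlink to TARGET_DIR.
# Usage: link_ncbi_cache TARGET_DIR [LINK_PATH]
link_ncbi_cache() {
    local target="$1"
    local link="${2:-$HOME/ncbi}"

    # Create the real cache directory (no error if it already exists).
    mkdir -p "$target" || return 1

    if [ -L "$link" ]; then
        # An existing symlink is removed so it can be repointed.
        rm -f "$link" || return 1
    elif [ -e "$link" ]; then
        # A real file or directory may hold cached data; leave it alone.
        echo "link_ncbi_cache: $link exists and is not a symlink; move it first" >&2
        return 1
    fi

    ln -s "$target" "$link"
}

# Example call (mygroup is the placeholder group name from the text):
# link_ncbi_cache "/ufrc/mygroup/$USER/ncbi"
```

If the function reports that ~/ncbi is a real directory, move its contents into the /ufrc directory first, remove the emptied ~/ncbi, and call the function again.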
It appears that data uploads to NCBI only work from login servers. Start a screen session before beginning an upload if there are any concerns about being disconnected.
- HPC_SRA_DIR - installation directory
- HPC_SRA_BIN - location of the executables directory
- HPC_SRA_DOC - location of the documentation directory
To download SRA data, you can use the "ascp" utility from the Aspera Connect browser plugin package. A copy is installed and provided by the sra module. A wrapper script, ascp.sh, that automatically uses the SSH key is also available. For instance:
ascp.sh -QT email@example.com:/genomes/Bacteria/all.faa.tar.gz faa
will download the all.faa.tar.gz archive to the faa directory.
Note: If the download fails with a "Session Stop (Error: Client unable to connect to server (check UDP port and firewall))" error, please submit a support request. This error means that the remote site has not been allowed through the firewall. Please include in the request the path to the script you used to run the data transfer command. Do not include any sensitive information, such as passwords or keys, in the request.