Difference between revisions of "SRA"
Moskalenko (talk | contribs) m (Text replace - "==Running the application using modules==" to "==Execution Environment and Modules==") |
m (Fix formatting) |
||
(17 intermediate revisions by 2 users not shown) | |||
Revision as of 19:05, 6 January 2022
Description
This is the NCBI Sequence Read Archive (SRA, formerly Short Read Archive) Toolkit.
Note: the SRA toolkit creates a $HOME/ncbi/public directory and caches prefetched data files there. However, the home directory has a 20 GB limit, and using it for job data storage violates the UFRC storage policy (https://www.rc.ufl.edu/about/policies/storage/). You must change that location to a directory in your UFRC /blue space before running the SRA toolkit. The official approach is to use the vdb-config tool
vdb-config -i
and change the directory to, for example, /blue/$GROUP/$USER/ncbi/public. See the SRA Toolkit Configuration documentation (https://github.com/ncbi/sra-tools/wiki/Toolkit-Configuration) for more details.
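For scripted setups, the same change can probably be made without the interactive menu. The sketch below is not part of the original instructions: the function name `set_sra_cache_dir` is hypothetical, and the `--set` option with the `/repository/user/main/public/root` node is an assumption that should be checked against `vdb-config --help` for the installed toolkit version.

```shell
# Sketch: set the SRA cache location non-interactively.
# Assumption: vdb-config accepts --set <node=value> and the user cache root
# lives at /repository/user/main/public/root (verify for your toolkit version).
set_sra_cache_dir() {
    cache_dir="$1"   # e.g. /blue/$GROUP/$USER/ncbi/public

    # vdb-config is only on PATH after the sra module has been loaded.
    if ! command -v vdb-config >/dev/null 2>&1; then
        echo "vdb-config not found; load the sra module first" >&2
        return 1
    fi

    # Create the target directory before pointing the toolkit at it.
    mkdir -p "$cache_dir" || return 1

    vdb-config --set "/repository/user/main/public/root=$cache_dir"
}
```

A call might look like `set_sra_cache_dir "/blue/$GROUP/$USER/ncbi/public"`, run once after loading the sra module.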
Alternatively, create an 'ncbi' directory in your /blue space and make ~/ncbi a symlink to it. For example:
$ mkdir /blue/mygroup/$USER/ncbi
$ ln -s /blue/mygroup/$USER/ncbi ~/ncbi
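If ~/ncbi already exists as a real directory, a plain `ln -s` would place the link inside it instead of replacing it. A slightly more defensive version of the symlink approach could look like the sketch below. The function name `link_ncbi_cache` is hypothetical, and `mygroup` remains a placeholder for your own group name.

```shell
# Sketch: point ~/ncbi at a directory outside $HOME via a symlink.
link_ncbi_cache() {
    target="$1"            # e.g. /blue/mygroup/$USER/ncbi
    link="$HOME/ncbi"

    # Create the target directory (and any missing parents).
    mkdir -p "$target" || return 1

    if [ -L "$link" ]; then
        # ~/ncbi is already a symlink: repoint it at the requested target.
        ln -sfn "$target" "$link"
    elif [ -e "$link" ]; then
        # A real directory or file exists; do not replace it silently.
        echo "refusing to replace existing $link; move its contents to $target first" >&2
        return 1
    else
        ln -s "$target" "$link"
    fi
}
```

For instance, `link_ncbi_cache "/blue/mygroup/$USER/ncbi"` creates the directory and the link in one step, and stops with a message if cached data already sits in ~/ncbi.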
Uploads
It appears that data uploads to NCBI only work from login servers. Start a screen session before beginning an upload if there are any concerns about being disconnected.
Required Modules
Serial
- sra
System Variables
- HPC_SRA_DIR - installation directory
- HPC_SRA_BIN - location of the executables directory
- HPC_SRA_DOC - location of the documentation directory
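To check that the variables listed above are defined in the current shell, a small helper along these lines may be useful. It is only a sketch: the function name `sra_env_report` is hypothetical, and the variables are expected to be set only after the sra module has been loaded.

```shell
# Sketch: print the SRA module variables, or "(not set)" for missing ones.
sra_env_report() {
    for name in HPC_SRA_DIR HPC_SRA_BIN HPC_SRA_DOC; do
        # Look up the variable whose name is stored in $name.
        # eval is safe here because the names come from the fixed list above.
        eval "value=\${$name:-}"
        printf '%s=%s\n' "$name" "${value:-(not set)}"
    done
}
```

Running `module load sra` followed by `sra_env_report` should then list the three locations, and a "(not set)" entry suggests the module is not loaded in that shell.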
Additional Information
Aspera Connect
To download SRA data, you can use the "ascp" utility from the Aspera Connect browser plugin package (http://asperasoft.com/downloads/). The sra module provides an installed copy, along with a wrapper script, ascp.sh, that automatically uses the SSH key. For instance:
ascp.sh -QT anonftp@ftp-private.ncbi.nlm.nih.gov:/genomes/Bacteria/all.faa.tar.gz faa
will download the all.faa.tar.gz archive to the faa directory.
Note: if the download fails with a "Session Stop (Error: Client unable to connect to server (check UDP port and firewall))" error, the remote site has not been allowed through the firewall. Please submit a support request and include the path to the script you used to run the data transfer command. Do not include sensitive information such as passwords or keys in the request.