Difference between revisions of "Reference Data"

From UFRC
(Created page with "UFRC maintains a repository of reference data that can be accessed by all HiPerGator users. The primary purposes of this repository are convenience and efficient use of filesy...")
 
Line 1: Line 1:
UFRC maintains a repository of reference data that can be accessed by all HiPerGator users. The primary purposes of this repository are convenience and efficient use of filesystem space. We download and/or build reference datasets and configure applications we make available through the [[Environment Modules System]] to automatically make use of the available reference data. We also would like to avoid seeing multiple users download many copies of the same large reference datasets that are of general interest to our clients. Having UFRC host common reference data also means that a research group does not have to use their Blue or Orange quota to host a redundant copy of common reference data.
+
UFRC maintains a repository of reference data that can be accessed by all HiPerGator users. The primary purposes of this repository are researcher convenience, efficient use of filesystem space, and cost savings. We are happy to download and build reference datasets and configure applications installed on HiPerGator to automatically make use of the available reference data. Having UFRC host common reference data means that a research group does not have to use their Blue or Orange quota to host redundant copies of common reference data.
  
 
Use [https://support.rc.ufl.edu https://support.rc.ufl.edu] to request the addition of reference data or to ask for a directory where you can place reference data for shared use.
 
Line 11: Line 11:
 
==Genome Data==
 
 
* 3k Rice genomes - reference/genomes/rice3k - from [http://gigadb.org/dataset/200001 http://gigadb.org/dataset/200001], downloaded from SRA.
 
 +
 +
==AI Training and Validation Data and Models==
 +
* [http://sintel.is.tue.mpg.de MPI-Sintel] data - /data/reference/ai/data/MPI-Sintel/

Revision as of 13:53, 27 August 2020

UFRC maintains a repository of reference data that can be accessed by all HiPerGator users. The primary purposes of this repository are researcher convenience, efficient use of filesystem space, and cost savings. We are happy to download and build reference datasets and configure applications installed on HiPerGator to automatically make use of the available reference data. Having UFRC host common reference data means that a research group does not have to use their Blue or Orange quota to host redundant copies of common reference data.

Use https://support.rc.ufl.edu to request the addition of reference data or to ask for a directory where you can place reference data for shared use.

The following is not an exhaustive list of the hosted reference data. If an existing reference is missing from the list below, please let us know and we will update the list.

Application-Specific Data

Genome Data

AI Training and Validation Data and Models

  • MPI-Sintel data - /data/reference/ai/data/MPI-Sintel/
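
Before downloading your own copy of a dataset, it can help to check whether the hosted copy is visible from your session. The following is a minimal sketch, not an official UFRC tool: the helper name `list_reference_entries` is hypothetical, and the only path used is the MPI-Sintel directory from the list above. It assumes the directory is readable from the node you are logged in to.

```python
from pathlib import Path

# Path copied from the list above; whether it is visible depends on
# where this script runs (assumption: a HiPerGator session).
SINTEL_DIR = Path("/data/reference/ai/data/MPI-Sintel/")


def list_reference_entries(root):
    """Return the sorted names of entries directly under root.

    Returns an empty list when root does not exist or is not a directory,
    so callers can treat "missing" and "empty" the same way.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    entries = list_reference_entries(SINTEL_DIR)
    if entries:
        print("\n".join(entries))
    else:
        print(f"{SINTEL_DIR} was not found or is empty")
```

If the directory is listed, point your application at the hosted copy instead of using your Blue or Orange quota for a duplicate.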