Storage

Revision as of 21:11, 9 March 2021 by Moskalenko (talk | contribs)

UF Research Computing maintains several shared storage systems, each intended for different user activities. A general overview of UFRC procedures for using our filesystems can be found at https://www.rc.ufl.edu/help/getting-started/storage// and https://www.rc.ufl.edu/about/procedures/storage/types/. Here we discuss practical use of the filesystems on HiPerGator.

Home Storage

Your home directory is the first directory you see when you log into HiPerGator. It can always be reached as '~', '/home/$USER', or $HOME, and any of these forms can be used in scripts. Home directories are the smallest storage available to our users. They contain files needed to set up your shell environment and secure shell connections. Do not remove any .bash* files or the .ssh directory, or you will have problems using your HiPerGator account. Let us know if any of them were removed by accident so that we can reset the files to their standard versions.
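As a quick illustration of using $HOME in a script, the sketch below checks that the .ssh directory and at least one .bash* file are still present. The function name check_home_setup is hypothetical and is not a UFRC-provided tool; it only reports what is missing and does not restore anything.

```shell
#!/usr/bin/env bash
# Sketch: report whether the shell/SSH setup files mentioned above exist.
# check_home_setup is a hypothetical helper name, not a UFRC tool.
check_home_setup() {
    local home_dir="${1:-$HOME}"   # defaults to the current user's home
    local missing=0
    local found_bash=0
    local f

    # The .ssh entry must exist and be a directory.
    if [ ! -d "$home_dir/.ssh" ]; then
        echo "missing: $home_dir/.ssh"
        missing=1
    fi

    # At least one .bash* file (e.g. .bashrc, .bash_profile) should exist.
    for f in "$home_dir"/.bash*; do
        if [ -e "$f" ]; then
            found_bash=1
            break
        fi
    done
    if [ "$found_bash" -eq 0 ]; then
        echo "missing: $home_dir/.bash* files"
        missing=1
    fi

    return "$missing"
}
```

Calling check_home_setup with no argument inspects $HOME; a non-zero exit status means something listed above was not found.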

The first rule of using the HOME directory is to not use it for reading or writing data files in any analyses run on HiPerGator. It is permissible to keep software builds, conda environments, text documents, and valuable scripts in $HOME as it is somewhat protected by daily snapshots.

Blue Storage

Blue Storage is our main high-performance parallel filesystem. This is where all job input/output (a.k.a. 'job i/o', the reading and writing of files) must happen. By default, your personal directory tree starts at '/blue/GROUP/USER'. That directory cannot be modified by other group members. There is a shared directory at '/blue/GROUP/share' for groups that prefer to share all their data between group members.

The parallel nature of Blue Storage makes it very efficient at reading and writing large files, which can be 'striped', or broken into pieces that are stored on different storage servers. It does not deal well with directories that hold a large number of very small files. If a job produces such files, it is advisable to use the Temporary Directories to reduce the burden on Blue Storage and keep it responsive and performant for everyone.

For groups that purchased separate storage for additional projects, the default path to the project directories is '/blue/PROJECT'. That directory is set up similarly to the 'share' directory in the primary group directory tree.
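One way to follow the small-files advice can be sketched as below: the job writes its many small files to a temporary directory and saves a single archive to Blue Storage. This is a minimal sketch under assumptions. The helper name archive_small_files, the file names, and the use of mktemp (which places the directory under $TMPDIR when that variable is set) are illustrative choices, not the documented UFRC Temporary Directories mechanism, and the loop stands in for a real analysis step.

```shell
#!/usr/bin/env bash
# Sketch: write many small files to a temporary directory, then store one
# archive in the destination (e.g. a directory under /blue/GROUP/USER).
# archive_small_files is a hypothetical helper; all paths are placeholders.
archive_small_files() {
    local dest_dir="$1"         # where the single archive should end up
    local n_files="${2:-100}"   # how many small files the toy job writes
    local scratch
    local status
    local i

    # Assumption: mktemp -d gives a suitable scratch location.
    scratch="$(mktemp -d)" || return 1

    # Stand-in for the real job: produce many tiny files in scratch.
    i=1
    while [ "$i" -le "$n_files" ]; do
        echo "record $i" > "$scratch/part_$i.txt"
        i=$((i + 1))
    done

    # Store one larger file on the parallel filesystem instead of many.
    if mkdir -p "$dest_dir" &&
        tar -czf "$dest_dir/parts.tar.gz" -C "$scratch" .; then
        status=0
    else
        status=1
    fi

    rm -rf "$scratch"   # clean up the temporary files either way
    return "$status"
}
```

For example, archive_small_files /blue/GROUP/USER/results 1000 would leave one parts.tar.gz in the results directory rather than a thousand small files.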

Orange Storage

As described in the UFRC filesystem document above, Orange Storage is cheaper than Blue, but that means it cannot support the full brunt of the applications running on HiPerGator. Limit its use to long-term storage of data that is not currently in use, or to very gentle access such as serial reading of raw data for QC/filtering, with the output of that first step in many workflows going to your Blue Storage directory tree. Do not be alarmed if you do not see your Orange Storage directory with 'ls' at first. Orange Storage is connected (mounted) on demand, so you have to 'use' it before it becomes visible. For example, changing into your Orange directory will connect it and make it visible, so use

cd /orange/mygroup

before using

ls /orange/mygroup

to avoid surprises.
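In a script, the same two steps can be wrapped in a small function. This is only a sketch: list_after_access is a hypothetical helper name, and '/orange/mygroup' is the example path from above.

```shell
#!/usr/bin/env bash
# Sketch: access a directory once so that an on-demand mount is triggered,
# then list it. list_after_access is a hypothetical helper name.
list_after_access() {
    local dir="$1"   # e.g. /orange/mygroup

    # cd inside a subshell so the caller's working directory is unchanged.
    if ( cd "$dir" ) 2>/dev/null; then
        ls "$dir"
    else
        echo "cannot access $dir" >&2
        return 1
    fi
}
```

Calling list_after_access /orange/mygroup changes into the directory first and only then lists it, returning a non-zero status if the directory cannot be entered.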