Difference between revisions of "New user training"

Line 11: Line 11:
  # Locate where to receive user support
  # Identify common user mistakes and how to avoid them.

Line 23: Line 22:
  ** 1,120 NVIDIA A100 GPUs
  ** 17,000 AMD Rome Epyc Cores
+
+ For additional information visit our website: https://www.rc.ufl.edu/

  '''Summary:''' HiPerGator is a large, high-performance compute cluster capable of tackling some of the largest computational challenges, but users need to understand how to responsibly and efficiently use the resources.

Line 40: Line 41:
  * Price sheets are located here: https://www.rc.ufl.edu/services/rates/
  * Submit a purchase request here: https://www.rc.ufl.edu/services/purchase-request/
+
+ ===Module 2: How to Access and Run Jobs===
+
+ ====Cluster Components====
+ # [[Development and Testing|Login servers]]
+ # [[Sample SLURM Scripts|SLURM Scheduler]]
+ # [https://www.rc.ufl.edu/about/cluster-history/ Compute Cluster]
+
+ ====Accessing HiPerGator====
+ * [[Training#Connecting_to_HiPerGator|ssh to host hpg.rc.ufl.edu]]
+ * [https://jhub.rc.ufl.edu/ jhub.rc.ufl.edu] (requires UF Network)
+ ** See also [[Training#Using_Jupyter_Notebooks_on_HiPerGator|overview video]].
+ * [https://galaxy.rc.ufl.edu/ galaxy.rc.ufl.edu]
+ * Open on Demand: [https://ood.rc.ufl.edu/ ood.rc.ufl.edu] (requires UF Network)
+ ** See also [[Training#Using_Open_on_Demand_on_HiPerGator|overview video]].

Revision as of 16:08, 11 August 2020

New User Training

This page mirrors and expands upon the content provided in the New User Training module in myTraining. All new account holders must complete the New User Training module within two weeks of obtaining an account. Users who do not complete the training will have their accounts deactivated until the training is completed.

Training Objectives

  1. Recognize the role of Research Computing, utilize HiPerGator as a research tool and select appropriate resource allocations for analyses
  2. Log in to HiPerGator using an ssh client
  3. Describe appropriate use of the login servers and how to request resources for work beyond those limits
  4. Describe HiPerGator's three main storage systems and the appropriate use for each
  5. Use the module system for loading application environments
  6. Locate where to receive user support
  7. Identify common user mistakes and how to avoid them.


Module 1: Introduction to Research Computing and HiPerGator

HiPerGator

  • 46,000 cores
  • Hundreds of GPUs
  • 10 Petabytes of storage
  • The new HiPerGator AI cluster will add:
    • 1,120 NVIDIA A100 GPUs
    • 17,000 AMD Rome Epyc Cores

For additional information visit our website: https://www.rc.ufl.edu/

Summary: HiPerGator is a large, high-performance compute cluster capable of tackling some of the largest computational challenges, but users need to understand how to responsibly and efficiently use the resources.

Investor Supported

HiPerGator is heavily subsidized by the university, but we do require faculty researchers to make investments for access. Research Computing sells three main products:

  1. Compute: NCUs (Normalized Compute Units)
    • 1 CPU core and 3.5 GB of RAM
  2. Storage:
    • Blue: High-performance
    • Orange: Intended for archival use
  3. GPUs
    • Sold in units of GPU cards
    • NCU investment also required to make use of GPU

Investments can be either hardware investments, lasting 5 years, or service investments, lasting 3 months or longer.
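
To make the NCU definition concrete, the short Python sketch below estimates how many NCUs would cover a workload with a given core count and memory request. Only the ratio of 1 CPU core and 3.5 GB of RAM per NCU comes from the list above. The function name `ncus_needed` is hypothetical, and the sizing rule (take whichever of cores or memory demands more NCUs) is an assumption made for illustration, not a statement of how Research Computing enforces allocations.

```python
import math

# From the product list above: one NCU is 1 CPU core and 3.5 GB of RAM.
CORES_PER_NCU = 1
RAM_GB_PER_NCU = 3.5


def ncus_needed(cores: int, ram_gb: float) -> int:
    """Estimate the NCUs whose combined cores and RAM cover a workload.

    Assumption (not stated in the training text): the requirement is set
    by whichever resource, cores or memory, needs more NCUs.
    """
    if cores < 1 or ram_gb < 0:
        raise ValueError("cores must be >= 1 and ram_gb must be >= 0")
    by_cores = math.ceil(cores / CORES_PER_NCU)
    by_ram = math.ceil(ram_gb / RAM_GB_PER_NCU)
    return max(by_cores, by_ram)


if __name__ == "__main__":
    # A 4-core job with a modest memory request is limited by its cores.
    print(ncus_needed(4, 10))
    # The same 4 cores with 64 GB of RAM is limited by memory instead.
    print(ncus_needed(4, 64))
```

Under this assumed rule, a memory-heavy analysis can require far more NCUs than its core count alone suggests, which is worth keeping in mind when selecting a resource allocation.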

Module 2: How to Access and Run Jobs

Cluster Components

  1. Login servers
  2. SLURM Scheduler
  3. Compute Cluster

Accessing HiPerGator