Difference between revisions of "GPU Access"

Latest revision as of 18:13, 18 January 2024

Interactive OnDemand jobs in the GPU partition are limited to 12 hours. Computational GPU jobs are limited to 14 days. Each GPU job requires at least one CPU core.

Normalized Graphics Processor Units (NGUs) include all of the infrastructure (memory, network, rack space, cooling) necessary for GPU-accelerated computation. Each NGU is currently equivalent to one GPU; however, newer GPUs such as the A100s may require more than one NGU in the future.

Researchers can add NGUs to their allocations by filling out the Purchase Form or requesting a Trial Allocation.

Open On Demand Access

To access GPUs using Open OnDemand, please check the form for your application. If your application supports multiple GPU types, choose the GPU partition and specify the number and type of GPUs:

To request access to one GPU of any type, use this gres string:

gpu:1

To request multiple GPUs of any type, use this gres string, where n is the number of GPUs you need:

gpu:n

To request a specific type of GPU, use this gres string (requesting geforce GPUs in this example):

gpu:geforce:1

To request an A100 GPU, use this gres string:

gpu:a100:1
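The gres strings above follow a `gpu[:type]:count` pattern. As a minimal sketch, the hypothetical helper `build_gres` below (not part of HiPerGator or SLURM) assembles such a string from a count and an optional GPU type:

```shell
#!/bin/bash
# Hypothetical helper: assemble a gres string in the forms listed above.
# Usage: build_gres COUNT [TYPE]
build_gres() {
    local count="$1"
    local type="${2:-}"   # optional GPU type, e.g. geforce or a100

    # Reject anything that is not a positive integer (no leading zeros).
    case "$count" in
        ''|*[!0-9]*|0*) echo "count must be a positive integer" >&2; return 1 ;;
    esac

    if [ -n "$type" ]; then
        echo "gpu:${type}:${count}"
    else
        echo "gpu:${count}"
    fi
}

build_gres 1           # one GPU of any type
build_gres 2 geforce   # two GeForce GPUs
build_gres 1 a100      # one A100
```

The resulting string is what goes into the gres field of the Open OnDemand form.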

GPU-enabled Services

Types of GPUs are listed below. Two partitions contain GPUs - the hwgui partition for visualization and the gpu partition for general computation.

Hardware Accelerated GUI

GPUs in these servers are used to accelerate rendering for graphical applications. These servers are in the SLURM "hwgui" partition. Refer to the Hardware Accelerated GUI Sessions page for more information on available resources and usage.

GPU Assisted Computation

A number of high performance applications installed on HiPerGator implement GPU-accelerated computing functions via CUDA to achieve significant speed-up over CPU calculations. These servers are in the SLURM "gpu" partition (--partition=gpu).

Hardware Specifications for the GPU Partition

We have the following types of NVIDIA GPU nodes available in the "gpu" partition:

GPU Specs	Host Quantity	Host Architecture	Host Memory	Host Interconnect	CPUs per Host	CPUs per Socket	GPUs per Host	CPUs per GPU	Memory per GPU	SLURM Feature	GRES GPU type	Technical Ref
GeForce 1080Ti	1	Intel Haswell	128 GB	FDR IB	28	14	2	14	11GB	n/a	geforce	Specifications
GeForce 2080Ti	32	Intel Skylake	187 GB	EDR IB	32	16	8	4	11GB	2080ti	geforce	Specifications
GeForce 2080Ti	38	Intel Cascade Lake	187 GB	EDR IB	32	16	8	4	11GB	2080ti	geforce	Specifications
Quadro RTX 6000 SLI	6	Intel Cascade Lake	187 GB	EDR IB	32	16	8	4	23GB	rtx6000	quadro	Specifications
NVIDIA A100 NVSWITCH	140	AMD EPYC ROME	2 TB	HDR IB	128	16	8	16	80GB	a100	a100	Specifications
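The "CPUs per GPU" column appears to be the host's CPU count divided by its GPU count. The sketch below reproduces that arithmetic for the rows above; the function name `cpus_per_gpu` is illustrative only:

```shell
#!/bin/bash
# Usage: cpus_per_gpu CPUS_PER_HOST GPUS_PER_HOST
# Integer division of host cores by host GPUs, matching the table's "CPUs per GPU" column.
cpus_per_gpu() {
    echo $(( $1 / $2 ))
}

cpus_per_gpu 28 2     # GeForce 1080Ti host
cpus_per_gpu 32 8     # GeForce 2080Ti and Quadro RTX 6000 hosts
cpus_per_gpu 128 8    # A100 host
```

Requesting more CPU cores per GPU than this ratio can leave GPUs on a host without cores to serve them, so the ratio is a reasonable starting point when sizing a job.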

For a list of additional node features, see the Available Node Features page.

To select a specific type of GPU within a partition, please use either a SLURM constraint (e.g. --constraint=rtx6000) or a GRES with the needed GPU type (e.g. --gres=gpu:a100:1 or --gpus=a100:1).
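As a minimal sketch, a batch script that selects a GPU type through GRES might look like the following. The job name, task layout, and time value are assumptions chosen for illustration, and `report_gpus` is a hypothetical helper; only the partition, GRES string, and limits come from this page.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=gpu_example
#SBATCH --partition=gpu
# One A100 selected by GRES type; a constraint such as --constraint=rtx6000
# combined with --gres=gpu:1 is the alternative described above.
#SBATCH --gres=gpu:a100:1
# Each GPU job needs at least one CPU core.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Must stay within the 14-day limit for computational GPU jobs.
#SBATCH --time=01:00:00

# Report which GPUs the scheduler exposed to this job.
report_gpus() {
    echo "Host: ${HOSTNAME:-unknown}"
    # SLURM typically sets CUDA_VISIBLE_DEVICES for jobs that request GPUs.
    echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
    # nvidia-smi is only present on GPU nodes, so guard the call.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv
    fi
}

report_gpus
```

Submitted with `sbatch`, the script prints the allocated device list before any real workload would start, which makes it easy to confirm that the requested GPU type was granted.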

Compiling CUDA Enabled Programs

The most direct way to develop a custom GPU-accelerated algorithm is with CUDA programming; please refer to the Nvidia CUDA Toolkit page. The current CUDA environment is cuda/11. Alternatively, C++ packages or Python packages such as numba and PyCUDA can be used to program GPU algorithms.
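The following sketch shows one way a small CUDA program could be compiled and run. It assumes that `cuda/11` is loadable as an environment module and that `nvcc` is then on the PATH; the file name `hello.cu` and its kernel are hypothetical examples, not code from the Nvidia CUDA Toolkit page.

```shell
#!/bin/bash
# Write a minimal CUDA source file (hypothetical name: hello.cu).
cat > hello.cu <<'EOF'
#include <cstdio>

// Each GPU thread prints its own index.
__global__ void hello() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();          // launch 1 block of 4 threads
    cudaDeviceSynchronize();    // wait for the kernel and its output to finish
    return 0;
}
EOF

# Load the CUDA environment named in the text, if the module command exists.
# Assumption: "cuda/11" is the module name.
if command -v module >/dev/null 2>&1; then
    module load cuda/11
fi

# Compile and run only where nvcc is available.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o hello hello.cu && ./hello
else
    echo "nvcc not found; load the CUDA module first"
fi
```

The binary only produces output on a node with a GPU allocated, so it should be run inside a job in the gpu partition rather than on a login node.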

Multiple GPUs

For training across multiple GPUs, see the Multi-GPU Training resource.

Slurm and GPU Use

View instructions for using GPUs and scheduling GPU jobs with SLURM at Slurm and GPU Use

@@ Line 1: / Line 1: @@
-[[Category:SLURM]]
+[[Category:Scheduler]][[Category:GPU]]
-Researchers may use GPUs in the form of Normalized Graphics Processor Units (NGUs), which include all of the infrastructure (memory, network, rack space, cooling), necessary for GPU-accelerated computation.
+{|align=right
+  |__TOC__
+  |}
+{{Note|Interactive OnDemand jobs in the GPU partition are limited to 12 hours. Computational GPU jobs are limited to 14 days. Each GPU job requires at least one CPU core.|warn}}
-Groups that do not have GPU allocations can invest into GPUs by filling out the purchase form at: https://www.rc.ufl.edu/services/purchase-request/.
+Normalized Graphics Processor Units (NGUs) include all of the infrastructure (memory, network, rack space, cooling) necessary for GPU-accelerated computation. Each NGU is currently equivalent to one GPU; however, newer GPUs such as the A100s may require more than one NGU in the future.
-=GPU-enabled Servers=
+Researchers can add NGUs to their allocations by filling out the [https://www.rc.ufl.edu/get-started/purchase-allocation/ Purchase Form] or requesting a [https://www.rc.ufl.edu/services/request-trial-allocation/ Trial Allocation].
-We have two types of GPU services for two different kinds of applications.
+==Open On Demand Access==
+To access GPUs using [https://help.rc.ufl.edu/doc/Open_OnDemand Open OnDemand], please check the form for your application. If your application supports multiple GPU types, choose the GPU partition and specify the number and type of GPUs:
+<div style="column-count:2">
+*To request access to one GPU of any type, use this gres string:
+ gpu:1
-== Hardware Accelerated GUI ==
+*To request multiple GPUs of any type, use this gres string, where n is the number of GPUs you need:
+ gpu:n
-GPUs are used for hardware accelerated graphical applications. To run this type of applications on HiPerGator, please use SLURM partition "'''hwgui'''" and refer to '''[[Hardware Accelerated GUI Sessions]]''' for more information on the usage.
+*To request a specific type of GPU, use this gres string (requesting geforce GPUs in this example):
+ gpu:geforce:1
-== GPU Assisted Computation ==
+*To request an A100 GPU, use this gres string:
+ gpu:a100:1
+</div>
-A number of high performance applications installed on HiPerGator implement GPU-accelerated computing functions via CUDA to achieve significant speed-up over CPU implementations. Please use SLURM partition '''"gpu"''' to run GPU enabled computational applications.
-=== GPU Specification for GPU Partition===
+==GPU-enabled Services==
-We have three types of NVIDIA GPU nodes currently available in "gpu" partition:
+Types of GPUs are listed below. Two partitions contain GPUs - the hwgui partition for visualization and the gpu partition for general computation.
-Nvidia K80s, with 2 GPUs per K80 card and 2 K80 cards in one host. Please refer to   [https://www.rc.ufl.edu main web site] [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/Tesla-K80-BoardSpec-07317-001-v05.pdf K80 technical specs]
+=== Hardware Accelerated GUI ===
-Nvidia
+GPUs in these servers are used to accelerate rendering for graphical applications. These servers are in the SLURM "'''hwgui'''" partition. Refer to the '''[[Hardware Accelerated GUI Sessions]]''' page for more information on available resources and usage.
+=== GPU Assisted Computation ===
+A number of high performance applications installed on HiPerGator implement GPU-accelerated computing functions via CUDA to achieve significant speed-up over CPU calculations. These servers are in the SLURM '''"gpu"''' partition (<code>--partition=gpu</code>).
+==== Hardware Specifications for the GPU Partition====
+We have the following types of NVIDIA GPU nodes available in the "gpu" partition:
 {| style="margin-left: 5px; width:80%"
 |
 {| class="wikitable" style="text-align: center"
-!GPU!!Quantity!!Host Quantity!!Host Architecture!!Host Memory!!Host Interconnect!!CPUs per Host!!Memory per GPU
+!GPU Specs!!Host Quantity!!Host Architecture!!Host Memory!!Host Interconnect!!CPUs per Host!!CPUs per Socket!!GPUs per Host!!CPUs per GPU!!Memory per GPU!!SLURM Feature!!GRES GPU type!!Technical Ref
 |-
-| style="width: 12%;"|Tesla K80||80||20||INTEL E5-2683||128 GB||FDR IB||28||12GB
+| style="width: 14%;"|GeForce 1080Ti||1||Intel Haswell||128 GB||FDR IB||28||14||2||14||11GB||n/a||geforce||[https://www.geforce.com/hardware/desktop-gpus/geforce-gtx-1080-ti/specifications Specifications]
+|-
+| style="width: 14%;"|GeForce 2080Ti||32||Intel Skylake||187 GB||EDR IB||32||16||8||4||11GB||2080ti||geforce||[https://www.nvidia.com/en-us/geforce/graphics-cards/rtx-2080-ti Specifications]
+|-
+| style="width: 14%;"|GeForce 2080Ti||38||Intel Cascade Lake||187 GB||EDR IB||32||16||8||4||11GB||2080ti||geforce||[https://www.nvidia.com/en-us/geforce/graphics-cards/rtx-2080-ti Specifications]
+|-
+| style="width: 14%;"|Quadro RTX 6000 SLI||6||Intel Cascade Lake||187 GB||EDR IB||32||16||8||4||23GB||rtx6000||quadro||[https://www.nvidia.com/en-us/design-visualization/quadro/rtx-6000/ Specifications]
+|-
+| style="width: 14%;"|NVIDIA A100 [https://www.nvidia.com/en-us/data-center/nvlink/ NVSWITCH]||140||AMD EPYC ROME||2 TB||HDR IB||128||16||8||16||80GB||a100||a100||[https://www.nvidia.com/en-us/data-center/a100/ Specifications]
 |}
 |}
-== Compile CUDA Enabled Programs ==
+For a list of additional node features, see the [[Available Node Features]] page.
-To compile CUDA programs, please refer to [[Nvidia CUDA Toolkit]]
-== GPU Use Policy ==
-'''Warning''':
-* GPUs are allocated only via the investment QOS. There is no burst QOS in the gpu partition. There are few GPUs on HiPerGator because of the high cost of GPU cards, so there is no spare capacity. Purchased GPUs need to be available for users who invested into GPU resources.
-* Time Limit for the gpu partition is 7 days (at most <code>#SBATCH --time=7-00:00:00</code>) to increase the availability of GPU resources.
-===Interactive Access (SLURM)===
-In order to request interactive access to a GPU under SLURM, use commands similar to those that follow.
-:'''•''' To request access to one GPU (of any type) for a default 10-minute session:
-::<source lang=bash>srun -p gpu --gres=gpu:1 --pty -u bash -i</source>
-:'''•''' To request access to two Tesla GPUs on a single node for a 1-hour session:
-::<source lang=bash>srun -p gpu --gres=gpu:tesla:2 --time=01:00:00  --pty -u bash -i</source>
-If no units are accessible, your request will be queued and your connection established once the next GPU becomes available. Otherwise, you may choose to try connecting again at a later time. If you have requested for a longer time than is needed, please be sure to end your session so that the GPU will be available for other users.
-=== Batch Jobs (SLURM) ===
-For batch jobs, to request GPU resources, use lines similar to the following in your submission script.
-:'''•''' In this example, two Tesla GPUs on a single server (--nodes defaults to "1") will be allocated to the job:
-<source lang=bash>
-#SBATCH --partition=gpu
-#SBATCH --gres=gpu:tesla:2
-</source>
-===Exclusive Mode===
-The GPUs are configured to run in '''exclusive''' mode.  This means that the gpu driver will only allow one process at a time to access the GPU.  If GPU 0 is in use and your application tries to use it, it will simply block.  If your application does not call cudaSetDevice(), the CUDA runtime should assign it to a free GPU.  Since everyone will be accessing the GPUs through the batch system, there should be no over-subscription of the GPUs.
-== Job Script Examples ==
-===Hybrid MPI/Threaded===
-This is a sample script for a hybrid MPI/threaded Gromacs job requesting and using GPUs under SLURM:
-<source lang=bash>
-#!/bin/bash
-#SBATCH --job-name=gromacs_gpu
-#SBATCH --output=gromacs_%j.out
-#SBATCH --error=gromacs_%j.err
-#SBATCH --partition=gpu
-#SBATCH --mail-type=END,FAIL
-#SBATCH --mail-user=user@some.domain.com
-#SBATCN --nodes=1
-#SBATCH --ntasks=2
-#SBATCH --cpus-per-task=7
-#SBATCH --ntasks-per-socket=1
-#SBATCH --mem-per-cpu=2600mb
-#SBATCH --distribution=cyclic:block
-#SBATCH --gres=gpu:tesla:2
-#SBATCH --time=6:00:00
-echo "Date      = $(date)"
+To select a specific type of GPU within a partition, please use either a SLURM constraint (e.g. --constraint=rtx6000) or a GRES with the needed GPU type (e.g. --gres=gpu:a100:1 or --gpus=a100:1).
-echo "host      = $(hostname -s)"
-echo "Directory = $(pwd)"
-module load intel/2017 openmpi/3.0.0 cuda/9.1.85 gromacs/2018
+== Compiling CUDA Enabled Programs ==
-GROMACS=gmx
+The most direct way to develop a custom GPU-accelerated algorithm is with CUDA programming; please refer to the [[Nvidia CUDA Toolkit]] page. The current CUDA environment is cuda/11. Alternatively, C++ packages or Python packages such as numba and PyCUDA can be used to program GPU algorithms.
-export OMP_NUM_THREADS=7
-T1=$(date +%s)
+== Multiple GPUs ==
-srun --mpi=pmix $GROMACS mdrun -v -deffnm topol
+For training across multiple GPUs, see the [https://github.com/YunchaoYang/MultiGPUTraining2023 Multi-GPU Training] resource.
-T2=$(date +%s)
-ELAPSED=$((T2 - T1))
+== Slurm and GPU Use ==
-echo "Elapsed Time = $ELAPSED"
+View instructions for using GPUs and scheduling GPU jobs with SLURM at [[Slurm and GPU Use]]
-</source>