Difference between revisions of "GPU Access"
Moskalenko (talk | contribs) |
Revision as of 18:16, 4 August 2021
Researchers may use GPUs in the form of Normalized Graphics Processor Units (NGUs), which include all of the infrastructure (memory, network, rack space, cooling) necessary for GPU-accelerated computation.
Groups that do not have GPU allocations can invest in GPUs by filling out the Purchase Form. Groups that have never purchased GPUs and would like to evaluate the benefit before purchasing can request a Trial Allocation.
GPU-enabled Services
We have two types of GPU services for two different kinds of applications.
Hardware Accelerated GUI
GPUs in these servers are used to accelerate rendering for graphical applications. These servers are in the SLURM "hwgui" partition. Refer to the Hardware Accelerated GUI Sessions page for more information on available resources and usage.
GPU Assisted Computation
A number of high-performance applications installed on HiPerGator implement GPU-accelerated computing functions via CUDA to achieve significant speed-up over CPU implementations. These servers are in the SLURM "gpu" partition (--partition=gpu).
Hardware Specifications for the GPU Partition
We have the following types of NVIDIA GPU nodes available in the "gpu" partition:
- NVIDIA K80s, with 2 GPUs per card and 2 cards per node. See technical specifications for reference.
- NVIDIA GeForce GTX 1080 Ti, with 1 GPU per card and 2 cards per node. See technical specifications for reference.
- NVIDIA GeForce RTX 2080Ti, with 1 GPU per card and 8 cards per node. See technical specifications for reference.
- NVIDIA Quadro RTX 6000, with 8 cards per node. These GPUs have SLI bridging. See technical specifications for reference.
- AI NVIDIA DGX A100 SuperPod, with 8 cards per node. These GPUs have NVSWITCH interconnects. See technical specifications for reference.
For a list of node features, and GPU name designations, see the Available Node Features page.
To select a specific type of GPU within a partition, use either a constraint on a SLURM feature or a GRES that names the needed GPU type.
Compiling CUDA Enabled Programs
To compile CUDA programs, please refer to the Nvidia CUDA Toolkit page. The current CUDA environment is cuda/10.
GPU Use Under Slurm
Policy
- GPUs are allocated only via the investment QOS.
- To increase the availability of GPU resources, the time limit for the gpu partition is 7 days (at most #SBATCH --time=7-00:00:00).
Interactive Access
In order to request interactive access to a GPU under SLURM, use commands similar to those that follow.
- To request access to one GPU (of any type) for a default 10-minute session:
srun -p gpu --gpus=1 --pty -u bash -i
- To request access to two Tesla GPUs on a single node for a 1-hour session:
srun -p gpu --nodes=1 --gpus=tesla:2 --time=01:00:00 --pty -u bash -i
- To request access to two GeForce GPUs on a single node for a 1-hour session:
srun -p gpu --nodes=1 --gpus=geforce:2 --time=01:00:00 --pty -u bash -i
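The interactive requests above all follow the same pattern, so it can help to assemble the command from its parts before running it. The following is a minimal sketch, not a site-provided tool: the function name gpu_srun_cmd is hypothetical, it only prints the command rather than executing it, and it always adds --nodes=1, which the first example above omits.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) an srun command for an
# interactive GPU session on the "gpu" partition.
# Usage: gpu_srun_cmd COUNT [TYPE] [TIME]
gpu_srun_cmd() {
    local count="$1" type="${2:-}" time="${3:-}"
    local gpus="$count"
    # Prefix the count with the GPU type only when a type was given.
    if [ -n "$type" ]; then
        gpus="${type}:${count}"
    fi
    local cmd="srun -p gpu --nodes=1 --gpus=${gpus}"
    # Add a time limit only when one was given; otherwise the default applies.
    if [ -n "$time" ]; then
        cmd="${cmd} --time=${time}"
    fi
    echo "${cmd} --pty -u bash -i"
}

# Two GeForce GPUs on one node for one hour.
gpu_srun_cmd 2 geforce 01:00:00
```

Printing the command first makes it easy to check the GPU type and time limit before committing to a queue wait.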
Open On Demand Access
To access GPUs using Open-On-Demand, please check the form for your application. If your application supports multiple GPU types, include the GPU type in the gres string to choose among them:
- To request access to one GPU of any type, use this gres string:
gpu:1
- To request multiple GPUs of any type, use this gres string, where n is the number of GPUs you need:
gpu:n
- To request a specific type of GPU, use this gres string (requesting geforce GPUs in this example):
gpu:geforce:1
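The three gres strings above differ only in whether a GPU type is present. As a small illustration, the sketch below builds the string from a count and an optional type; the function name gres_string is hypothetical and is not part of SLURM or Open OnDemand.

```shell
#!/bin/bash
# Hypothetical helper: build a gres string from a GPU count and an
# optional GPU type, following the gpu:n and gpu:type:n forms above.
gres_string() {
    local count="$1" type="${2:-}"
    if [ -n "$type" ]; then
        echo "gpu:${type}:${count}"
    else
        echo "gpu:${count}"
    fi
}

gres_string 1            # prints gpu:1
gres_string 1 geforce    # prints gpu:geforce:1
```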
Batch Jobs
For batch jobs, to request GPU resources, use lines similar to the following in your submission script.
- In this example, two Tesla GPUs on a single server (--nodes defaults to "1") will be allocated to the job:
#SBATCH --partition=gpu
#SBATCH --gpus=tesla:2
- In this example, two GeForce GPUs on a single server (--nodes defaults to "1") will be allocated to the job:
#SBATCH --partition=gpu
#SBATCH --gpus=geforce:2
Alternatively, use the '--gres=gpu:1' or '--gres=gpu:geforce:1' format. Note that if the '--gpus=' format is used, SLURM will not provide GPU usage data to slurmInfo, and those GPUs will not be shown in slurmInfo output.
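As an illustration of the '--gres=' form in a complete submission script, here is a minimal sketch that requests one GeForce GPU and reports what the job can see. The job name and the report_gpus function are hypothetical, and the script assumes that SLURM exports CUDA_VISIBLE_DEVICES for GPU jobs, which is typical but worth confirming on your system.

```shell
#!/bin/bash
#SBATCH --job-name=gpucheck        # hypothetical job name
#SBATCH --partition=gpu
#SBATCH --gres=gpu:geforce:1       # --gres form of the GPU request
#SBATCH --time=00:10:00

# Report the host and the GPU indices exposed to this job.
# Assumption: SLURM sets CUDA_VISIBLE_DEVICES for GPU allocations;
# "unset" is printed when the variable is missing.
report_gpus() {
    echo "host = $(hostname -s)"
    echo "CUDA_VISIBLE_DEVICES = ${CUDA_VISIBLE_DEVICES:-unset}"
}

report_gpus
```

Submitting a short check like this before a long run is a cheap way to confirm that the GPU request was granted as intended.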
If no GPUs are available, your request will be queued and your connection established once the next GPU becomes available. Alternatively, you may cancel your job and try connecting again at a later time. If you have requested a longer time than is needed, please be sure to end your session so that the GPU will be available for other users.
Exclusive Mode
The GPUs are configured to run in exclusive mode. This means that the GPU driver will only allow one process at a time to access a given GPU. If GPU 0 is in use and your application tries to use it, the application will simply block. If your application does not call cudaSetDevice(), the CUDA runtime should assign it to a free GPU. Since everyone will be accessing the GPUs through the batch system, there should be no over-subscription of the GPUs.
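To see the compute mode from inside a job, nvidia-smi can be queried directly. The following is a minimal sketch: the function name show_compute_mode is hypothetical, and the fallback messages are there only so the script behaves sensibly on a machine without NVIDIA tools. On a GPU configured as described above, the compute_mode column would be expected to show an exclusive setting rather than Default.

```shell
#!/bin/bash
# Print index, name, and compute mode for each GPU visible to this shell.
# Falls back to a notice when nvidia-smi is missing or the query fails.
show_compute_mode() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found (probably not on a GPU node)"
        return 0
    fi
    nvidia-smi --query-gpu=index,name,compute_mode --format=csv,noheader \
        || echo "nvidia-smi query failed"
}

show_compute_mode
```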
Slurm Directives for A100 GPUs
To use A100 GPUs for interactive sessions or batch jobs, please use one of the following options:
--partition=gpu
--gpus=tesla:2

--partition=hpg-ai
--gpus=2
Job Script Examples
MPI Parallel
This is a sample script for an MPI-parallel VASP job that requests and uses GPUs under SLURM:
#!/bin/bash
#SBATCH --job-name=vasptest
#SBATCH --output=vasp.out
#SBATCH --error=vasp.err
#SBATCH --mail-type=ALL
#SBATCH --mail-user=email@ufl.edu
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --ntasks-per-node=8
#SBATCH --ntasks-per-socket=4
#SBATCH --mem-per-cpu=7000mb
#SBATCH --distribution=cyclic:cyclic
#SBATCH --partition=gpu
#SBATCH --gres=gpu:geforce:4
#SBATCH --time=00:30:00

echo "Date      = $(date)"
echo "host      = $(hostname -s)"
echo "Directory = $(pwd)"

module purge
module load cuda/10.0.130 intel/2018 openmpi/4.0.0 vasp/5.4.4

T1=$(date +%s)
srun --mpi=pmix_v3 vasp_gpu
T2=$(date +%s)

ELAPSED=$((T2 - T1))
echo "Elapsed Time = $ELAPSED"