Difference between revisions of "Nvidia CUDA Toolkit"

From UFRC
(31 intermediate revisions by 6 users not shown)
Line 1: Line 1:
[[Category:Software]]
+
[[Category:Software]][[Category:Programming]][[Category:Library]][[Category:Graphics]][[Category:GPU]]
 
{|<!--CONFIGURATION: REQUIRED-->
 
{|<!--CONFIGURATION: REQUIRED-->
 
|{{#vardefine:app|cuda}}
 
|{{#vardefine:app|cuda}}
Line 19: Line 19:
 
CUDA™ is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA.
 
CUDA™ is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA.
 
<!--Modules-->
 
<!--Modules-->
==Required Modules==
+
==Environment Modules==
cuda
+
After loading a cuda environment module, run the 'module avail' command to see the module trees it makes available, including the compiler and openmpi modules that require the cuda module to be loaded.
 +
 
 
==System Variables==
 
==System Variables==
* HPC_{{#uppercase:{{#var:app}}}}_DIR
+
* HPC_{{uc:{{#var:app}}}}_DIR
* HPC_{{#uppercase:{{#var:app}}}}_BIN
+
* HPC_{{uc:{{#var:app}}}}_BIN
* HPC_{{#uppercase:{{#var:app}}}}_INC
+
* HPC_{{uc:{{#var:app}}}}_INC
* HPC_{{#uppercase:{{#var:app}}}}_LIB
+
* HPC_{{uc:{{#var:app}}}}_LIB
 
<!--Configuration-->
 
<!--Configuration-->
 
{{#if: {{#var: conf}}|==Configuration==
 
{{#if: {{#var: conf}}|==Configuration==
Line 31: Line 32:
 
|}}
 
|}}
 
<!--Run-->
 
<!--Run-->
==Available GPUs==
+
==Program Development==
Research Computing has a significant investment in GPU-enabled servers. Each supports from two to eight Nvidia GPUs (see table below).
 
 
 
===Available GPUs under Torque/Moab on HPG1===
 
{| border=1
 
!GPU!!Quantity!!Host Quantity!!Host Architecture!!Host Memory!!Host Interconnect!!Host Attributes!!Notes
 
|-
 
|M2070||8||4||Intel E5675||24 GB||QDR IB||fermi,m2070 || cobra[1-4]
 
|-
 
|M2070||8||1||Intel E5620||24 GB||GigE||fermi,m2070 || vette
 
|-
 
|M2090||4||1||Intel E5-2643||64 GB||FDR IB||fermi,m2090 ||
 
|-
 
|M2090||14||7||AMD Opteron 6220||32 GB||QDR IB||fermi,m2090 ||
 
|-
 
|}
 
  
===Available GPUs under SLURM on HPG1===
+
===Environment===
{| border=1
+
For CUDA development, please load the "cuda" module.  Doing so ensures that your environment is set up correctly for the CUDA compiler, header files, and libraries. Currently, cuda/9.2.88 and cuda/10.0.130 are the only versions supported on HiPerGator.
!GPU!!Quantity!!Host Quantity!!Host Architecture!!Host Memory!!Host Interconnect!!Host Attributes!!Notes
 
|-
 
|M2090||26||13||AMD Opteron 6220||32 GB||QDR IB||fermi,m2090 ||
 
|-
 
|}
 
  
===Available GPUs under SLURM on HPG2===
 
{| border=1
 
!GPU!!Quantity!!Host Quantity!!Host Architecture!!Host Memory!!Host Interconnect!!Host Attributes!!Notes
 
|-
 
|Tesla K80||32||8||INTEL E5-2683||132 GB||QDR IB||fermi,m2090 ||
 
|-
 
|}
 
 
==Usage Policy==
 
===Interactive Use===
 
 
If you need interactive access to a gpu for development and testing you may do so by requesting an interactive session through the batch system. 
 
 
To gain interactive access to a GPU server, run a command similar to the ones that follow.
 
====Under SLURM====
 
 
To access 1 GPU for a default 10-minute session:
 
 
<pre>
 
<pre>
srun -p hpg1-gpu --pty -u bash -i
 
</pre>
 
 
OR
 
<pre>
 
srun -p hpg2-gpu --pty -u bash -i
 
</pre>
 
 
To access  2 Tesla GPUs on one node for a 1-hour session:
 
 
<pre>
 
srun -p hpg1-gpu --gres=gpu:tesla:2 --time=01:00:00  --pty -u bash -i 
 
</pre>
 
 
OR
 
<pre>
 
srun -p hpg2-gpu --gres=gpu:tesla:2 --time=01:00:00  --pty -u bash -i 
 
</pre>
 
 
If a gpu is available, you will get a prompt on a gpu-enabled host within a minute or two.  Otherwise, you will have to wait or try another time.  If you choose to wait, you will be connected when a gpu is available.    The default walltime limit for the gpu queue is 10 minutes.  You should request the amount of time you need but be sure to log out and end your session when you are finished so that the GPU will be available to others.
 
 
===Batch Jobs===
 
 
For batch jobs, to access a host with Tesla GPUs, you can add the following to your submission script.
 
 
Under SLURM:
 
<pre>
 
#SBATCH --partition=hpg1-gpu
 
#SBATCH --gres=gpu:tesla:2
 
</pre>
 
 
OR
 
<pre>
 
#SBATCH --partition=hpg2-gpu
 
#SBATCH --gres=gpu:tesla:2
 
</pre>
 
 
===Exclusive Mode===
 
The GPUs are configured to run in '''exclusive''' mode.  This means that the gpu driver will only allow one process at a time to access the GPU.  If GPU 0 is in use and your application tries to use it, it will simply block.  If your application does not call cudaSetDevice(), the CUDA runtime should assign it to a free GPU.  Since everyone will be accessing the GPUs through the batch system, there should be no over-subscription of the GPUs.
 
 
==Environment==
 
For CUDA development please load the "cuda" module.  Doing so will ensure that your environment is set up correctly for the use of the CUDA compiler, header files, and libraries.
 
 
<source lang=bash>
 
 
$ module spider cuda
 
$ module spider cuda
--------------------------------------------------------
+
-------------------------------------------------------------
  cuda:
+
cuda:
--------------------------------------------------------
+
-------------------------------------------------------------
 
     Description:
 
     Description:
 
       NVIDIA CUDA Toolkit
 
       NVIDIA CUDA Toolkit
  
 
     Versions:
 
     Versions:
         cuda/4.2
+
         cuda/9.2.88
         cuda/5.5
+
         cuda/10.0.130
         cuda/7.0
+
          
  
----------------------------------------------------------
+
--------------------------------------------------------------------------------------------------------------------
    
+
   For detailed information about a specific "cuda" module (including how to load the modules) use the module full name.
For detailed information about a specific cuda module (including how to load the modules) use the module full name.
 
 
   For example:
 
   For example:
  
     $ module spider cuda/7.0
+
     $ module spider cuda/10.0.130
----------------------------------------------------------
+
--------------------------------------------------------------------------------------------------------------------
$ module load cuda/7.0
+
 
 +
$ module load cuda/10.0.130
  
 
$ which nvcc
 
$ which nvcc
/apps/cuda/7.0/bin/nvcc
+
/apps/compilers/cuda/10.0.130/bin/nvcc
  
 
$ printenv | grep CUDA
 
$ printenv | grep CUDA
HPC_CUDA_LIB=/apps/cuda/7.0/lib64
+
HPC_CUDA_LIB=/apps/compilers/cuda/10.0.130/lib64
HPC_CUDA_DIR=/apps/cuda/7.0
+
HPC_CUDA_DIR=/apps/compilers/cuda/10.0.130
HPC_CUDA_BIN=/apps/cuda/7.0/bin
+
HPC_CUDA_BIN=/apps/compilers/cuda/10.0.130/bin
HPC_CUDA_INC=/apps/cuda/7.0/include
+
HPC_CUDA_INC=/apps/compilers/cuda/10.0.130/include
</source>
+
UFRC_FAMILY_CUDA_VERSION=10.0.130
 +
</pre>
 +
 
 +
 
 +
===Selecting CUDA Arch Flags===
 +
When compiling with nvcc, you need to specify the Nvidia architecture that the CUDA files will be compiled for. Please refer to the [https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-feature-list GPU Feature List] for the CUDA naming scheme sm_xy, where x denotes the GPU generation and y denotes the version within that generation. The table below lists the SM flags for the GPU types on HiPerGator, including one that is no longer available.
 +
 
 +
{| class="wikitable"
 +
|-
 +
! SM !! Nvidia Cards
 +
|-
 +
| SM_37 || Tesla K80 (No longer available)
 +
|-
 +
| SM_61 || GeForce GTX 1080Ti
 +
|-
 +
| SM_75|| GeForce RTX 2080Ti
 +
|-
 +
| SM_80 || DGX A100
 +
|}
  
 
==Sample GPU Batch Job Scripts==
 
==Sample GPU Batch Job Scripts==
Line 154: Line 92:
  
 
See the [[Example_SLURM-GPU-Job-Scripts]] page for an example.
 
See the [[Example_SLURM-GPU-Job-Scripts]] page for an example.
 
===CUDA Examples===
 
See the [[{{PAGENAME}}_Examples]] page for CUDA development examples.
 
  
 
<!--|}}-->
 
<!--|}}-->

Revision as of 21:37, 23 September 2022

Description

cuda website  
CUDA™ is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA.

Environment Modules

After loading a cuda environment module, run the 'module avail' command to see the module trees it makes available, including the compiler and openmpi modules that require the cuda module to be loaded.

System Variables

  • HPC_CUDA_DIR
  • HPC_CUDA_BIN
  • HPC_CUDA_INC
  • HPC_CUDA_LIB
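
A quick way to confirm that these variables are present in the current shell is sketched below. This is only an illustration: the function name check_cuda_env is hypothetical and not something the cuda module provides.

```shell
# Report which of the documented HPC_CUDA_* variables are unset or empty.
# Prints the missing names (one per line) and returns non-zero if any are missing.
# check_cuda_env is a hypothetical helper name, not part of the cuda module.
check_cuda_env() {
    _missing=0
    for _name in HPC_CUDA_DIR HPC_CUDA_BIN HPC_CUDA_INC HPC_CUDA_LIB; do
        # Look up the value of the variable whose name is stored in $_name.
        eval "_value=\${$_name:-}"
        if [ -z "$_value" ]; then
            echo "$_name"
            _missing=1
        fi
    done
    return "$_missing"
}

if check_cuda_env >/dev/null; then
    echo "CUDA environment variables are set"
else
    echo "Some HPC_CUDA_* variables are missing; try loading a cuda module"
fi
```

Running check_cuda_env without the redirect lists the names that are missing, which can help when a job script fails because the module was not loaded.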

Program Development

Environment

For CUDA development, please load the "cuda" module. Doing so ensures that your environment is set up correctly for the CUDA compiler, header files, and libraries. Currently, cuda/9.2.88 and cuda/10.0.130 are the only versions supported on HiPerGator.

$ module spider cuda
-------------------------------------------------------------
cuda:
-------------------------------------------------------------
    Description:
      NVIDIA CUDA Toolkit

     Versions:
        cuda/9.2.88
        cuda/10.0.130
        

--------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "cuda" module (including how to load the modules) use the module full name.
  For example:

     $ module spider cuda/10.0.130
--------------------------------------------------------------------------------------------------------------------

$ module load cuda/10.0.130

$ which nvcc
/apps/compilers/cuda/10.0.130/bin/nvcc

$ printenv | grep CUDA
HPC_CUDA_LIB=/apps/compilers/cuda/10.0.130/lib64
HPC_CUDA_DIR=/apps/compilers/cuda/10.0.130
HPC_CUDA_BIN=/apps/compilers/cuda/10.0.130/bin
HPC_CUDA_INC=/apps/compilers/cuda/10.0.130/include
UFRC_FAMILY_CUDA_VERSION=10.0.130
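
The variables printed above are mostly useful when host code is built with a regular C++ compiler and linked against the CUDA runtime, because nvcc already adds its own include and library paths. The sketch below prints such a command rather than running it. The helper name cuda_host_compile_cmd and the file name saxpy_host.cpp are hypothetical, and the default paths are copied from the printenv output above only so that the example produces output outside the cluster.

```shell
# Print (rather than run) a g++ command that compiles a host-side C++ file
# against the CUDA runtime, using the paths exported by the cuda module.
# cuda_host_compile_cmd is a hypothetical helper name.
cuda_host_compile_cmd() {
    _src=$1
    _out=$2
    # Fail early with a hint if the module has not been loaded.
    if [ -z "${HPC_CUDA_INC:-}" ] || [ -z "${HPC_CUDA_LIB:-}" ]; then
        echo "HPC_CUDA_INC/HPC_CUDA_LIB not set; run 'module load cuda' first" >&2
        return 1
    fi
    echo "g++ -I$HPC_CUDA_INC $_src -L$HPC_CUDA_LIB -lcudart -o $_out"
}

# Demo values copied from the printenv output above; on the cluster the
# module sets these, so the defaults apply only when they are unset.
: "${HPC_CUDA_INC:=/apps/compilers/cuda/10.0.130/include}"
: "${HPC_CUDA_LIB:=/apps/compilers/cuda/10.0.130/lib64}"

# saxpy_host.cpp is a hypothetical source file used for illustration.
cuda_host_compile_cmd saxpy_host.cpp saxpy_host
```

Printing the command first makes it easy to inspect the paths before building; the printed line can then be pasted into the shell or a job script.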


Selecting CUDA Arch Flags

When compiling with nvcc, you need to specify the Nvidia architecture that the CUDA files will be compiled for. Please refer to the GPU Feature List (https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-feature-list) for the CUDA naming scheme sm_xy, where x denotes the GPU generation and y denotes the version within that generation. The table below lists the SM flags for the GPU types on HiPerGator, including one that is no longer available.

SM      Nvidia Cards
SM_37   Tesla K80 (No longer available)
SM_61   GeForce GTX 1080Ti
SM_75   GeForce RTX 2080Ti
SM_80   DGX A100
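
As a rough sketch of how the SM values in the table turn into compiler options, the helper below assembles nvcc -gencode flags for a list of compute capabilities. The function name gencode_flags and the file name kernel.cu are hypothetical; the -gencode arch=compute_XY,code=sm_XY form is the standard nvcc syntax. The compile command is printed, not executed.

```shell
# Build nvcc -gencode options for a list of compute capabilities (e.g. 61 75).
# gencode_flags is a hypothetical helper name.
gencode_flags() {
    _flags=""
    for _cc in "$@"; do
        # Each pair asks nvcc to embed native code for sm_XY.
        _flags="$_flags -gencode arch=compute_${_cc},code=sm_${_cc}"
    done
    # Strip the leading space before printing.
    echo "${_flags# }"
}

# Print a compile command targeting the GTX 1080Ti (61) and RTX 2080Ti (75).
# kernel.cu is a hypothetical source file name.
echo "nvcc $(gencode_flags 61 75) -o kernel kernel.cu"
```

Listing several capabilities produces one binary that runs on each listed GPU type, at the cost of a longer compile. Note that the toolkit version limits which targets are accepted: to the best of our knowledge, sm_80 is only recognized by CUDA 11.0 and later, so it would likely be rejected by the two module versions listed above.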

Sample GPU Batch Job Scripts

SLURM Job Scripts

See the Example_SLURM-GPU-Job-Scripts page for an example.