|
|
(46 intermediate revisions by 4 users not shown) |
Line 1: |
Line 1: |
− | HiPerGator users may finely control the compute nodes requested by a given SLURM job (e.g. specific chip families, processor models) by using the <code>--constraint</code> directive to specify the node features they desire. Note that the partition must be selected if it's not default. | + | [[Category:Scheduler]] |
| + | |
| + | HiPerGator users may finely control selection of compute hardware for a SLURM job like specific processor families, processor models by using the <code>--constraint</code> directive to specify node ''features''. |
| | | |
| ;Example: | | ;Example: |
− | #SBATCH --partition=hpg1-compute
| |
| #SBATCH --constraint=westmere | | #SBATCH --constraint=westmere |
| + | or |
| + | #SBATCH --constraint=haswell |
| | | |
− | ==Using node features as job constraints==
| + | Basic boolean logic can be used to request combinations of features. For example, to request nodes that have Intel processors '''AND''' InfiniBand interconnect use |
− | ===Commonly constrained features===
| |
− | Use node features as SLURM job constraints.
| |
− | | |
− | A non-exhaustive list of commonly used feature constraints, found to be generally useful:
| |
− | {| class="wikitable"
| |
− | | align="center" style="background:#f0f0f0;"|'''Feature'''
| |
− | | align="center" style="background:#f0f0f0;"|'''Constraints'''
| |
− | | align="center" style="background:#f0f0f0;"|'''Description'''
| |
− | |-
| |
− | | Compute partition||<code>hpg1</code> , <code>hpg2</code>||''Requests nodes within a specified compute partition''
| |
− | |-
| |
− | | Chip family||<code>amd</code> , <code>intel</code>||''Requests nodes having processors of a specified chip vendor''
| |
− | |-
| |
− | | Chassis model||<code>c6145</code> , <code>sos6320</code>||''Requests nodes having a specified chassis model''
| |
− | |-
| |
− | | Processor model||<code>o6220</code> , <code>o3678</code> , <code>o4184</code>||''Requests nodes having a specified processor model''
| |
− | |-
| |
− | | Network fabric||<code>infiniband</code>||''Requests nodes having an Infiniband interconnect''
| |
− | |}
| |
− | | |
− | ;Examples:
| |
− |
| |
− | To request an Intel processor, use the following:
| |
− | | |
− | #SBATCH --contstraint=intel
| |
− | | |
− | To request nodes that have Intel processors '''AND''' InfiniBand interconnect:
| |
| | | |
| #SBATCH --constraint='intel&infiniband' | | #SBATCH --constraint='intel&infiniband' |
| | | |
− | To request nodes that have processors from the Intel Sandy Bridge '''OR''' Haswell CPU families: | + | To request processors from either Intel Haswell '''OR''' skylake CPU family use |
| | | |
− | #SBATCH --constraint='sandy-bridge|haswell' | + | #SBATCH --constraint='haswell|skylake' |
| | | |
− | ==Node features by partition== | + | ==All Node Features== |
− | HiPerGator node features are documented comprehensively below, sectioned by partition. Use the table columns headings within each section to sort by the criteria of your choice.
| + | You can run <code>nodeInfo</code> command from the ufrc environment module to list all available SLURM features. In addition, the table below shows automatically updated nodeInfo output as well as the corresponding CPU models. |
| | | |
− | This documentation will be updated periodically. To request current node feature information directly from the cluster, load the <code>ufrc</code> module and run the following command:
| + | {{#get_web_data:url=https://data.rc.ufl.edu/pub/ufrc/data/node_data.csv |
− | $ nodeInfo
| + | |format=CSV with header |
| + | |data=partition=Partition,ncores=NodeCores,sockets=Sockets,ht=HT,socketcores=SocketCores,memory=Memory,features=Features,cpumodel=CPU |
| + | |cache seconds=7200 |
| + | }} |
| + | {| class="wikitable sortable" border="1" cellspacing="0" cellpadding="2" align="center" style="border-collapse: collapse; margin: 1em 1em 1em 0; border-top: none; border-right:none; " |
| + | ! Partition |
| + | ! Cores per node |
| + | ! Sockets |
| + | ! Socket Cores |
| + | ! Threads/Core |
| + | ! Memory,GB |
| + | ! Features |
| + | ! CPU Model |
| + | {{#for_external_table:<nowiki/> |
| + | {{!}}- |
| + | {{!}} {{{partition}}} |
| + | {{!}} {{{ncores}}} |
| + | {{!}} {{{sockets}}} |
| + | {{!}} {{{socketcores}}} |
| + | {{!}} {{{ht}}} |
| + | {{!}} {{{memory}}} |
| + | {{!}} {{{features}}} |
| + | {{!}} {{{cpumodel}}} |
| + | }} |
| + | |} |
| | | |
− | ===hpg1-compute===
| + | '''Note''': See [[GPU_Access]] for more details on GPUs, such as available GPU memory. The following CPU models are in order from the oldest to the newest - interlagos, magny, sandy-bridge, dhabi, haswell, broadwell, skylake. The 'dhabi' and 'haswell' models are from HPG1 and HPG2 deployments. |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | AMD||c6145||dhabi||o6378||160||4||64||250
| |
− | |-
| |
− | | AMD||a2840||opteron||o6220||64||2||16||60
| |
− | |-
| |
− | | Intel||r2740||westmere||x5675||8||2||12||94
| |
− | |-
| |
− | | Intel||c6100||westmere||x5675||16||2||12||92
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>hpg1-compute</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
− | ===hpg1-gpu===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | AMD||-||opteron||o6220||14||2||16||29
| |
− | |-
| |
− | | Intel||sm-x9drg||sandy-bridge||e5-s2643||7||2||8||62
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>hpg1-gpu</code> partition are equipped with Nvidia Tesla M2090 GPU Computing Modules.
| |
− | * Nodes in the <code>hpg1-gpu</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
− | ===hpg2-compute===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | Intel||sos6320||haswell||e5-s2643||900||2||32||125
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>hpg2-compute</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
− | ===hpg2-dev===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | AMD||sm-h8qg6||dhabi||o6378||2||2||28||125
| |
− | |-
| |
− | | Intel||sos6320||haswell||e5-2698||4||2||28||125
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>hpg2-dev</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
− | ===hpg2gpu===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | Intel||r730||haswell||e5-2683||11||2||28||125
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>hpg2gpu</code> partition are equipped with Nvidia Tesla K80 GPU Computing Modules.
| |
− | * Nodes in the <code>hpg2gpu</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
− | ===bigmem===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | AMD||-||magny||o6174||1||4||48||496
| |
− | |-
| |
− | | Intel||-||nehalem||x7560||1||2||16||125
| |
− | |-
| |
− | | Intel||-||||e5-4607||1||4||24||750
| |
− | |-
| |
− | | Intel||-||||e7-8850||1||8||80||1009
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | ===gui===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | Intel||sos6320||haswell||e5-2698||4||2||32||125
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | ===phase4===
| |
− | <div style="padding: 5px;">
| |
− | {| class="wikitable sortable"
| |
− | |-
| |
− | ! scope="col" | Chip Vendor
| |
− | ! scope="col" | Chassis Model
| |
− | ! scope="col" | Processor Family
| |
− | ! scope="col" | Processor Model
| |
− | ! scope="col" | Nodes
| |
− | ! scope="col" | Sockets
| |
− | ! scope="col" | CPUs
| |
− | ! scope="col" | RAM (GB)
| |
− | |-
| |
− | | AMD||c6105||libson||o4184||127||2||12||31
| |
− | |-
| |
− | |}
| |
− | </div>
| |
− | * Nodes in the <code>phase4</code> partition use the InfiniBand network fabric for distributed memory parallel processing and fast access to storage.
| |
HiPerGator users can control which compute hardware a SLURM job runs on, such as a specific processor family or processor model, by using the --constraint directive to request node features.
- Example
#SBATCH --constraint=westmere
or
#SBATCH --constraint=haswell
Basic boolean logic can be used to request combinations of features. For example, to request nodes that have Intel processors AND an InfiniBand interconnect, use:
#SBATCH --constraint='intel&infiniband'
To request processors from either the Intel Haswell OR Skylake CPU family, use:
#SBATCH --constraint='haswell|skylake'
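The directives above only take effect inside a job script, so a complete submission might look like the following minimal sketch. The job name, resource amounts, and time limit are hypothetical placeholders, not recommended values; only the --constraint line comes from the examples above. The final lines print the node's features, assuming scontrol is on the PATH, so you can confirm that the constraint was honored.

```shell
#!/bin/bash
#SBATCH --job-name=constraint_demo       # hypothetical job name
#SBATCH --ntasks=1                       # placeholder resource request
#SBATCH --mem=1G                         # placeholder memory request
#SBATCH --time=00:05:00                  # placeholder time limit
#SBATCH --constraint='haswell|skylake'   # run on a Haswell OR Skylake node

# SLURMD_NODENAME is set by Slurm inside a job; fall back to hostname
# so the script also runs outside the scheduler.
node="${SLURMD_NODENAME:-$(hostname)}"
echo "Running on ${node}"

# Show the feature list of the allocated node, if scontrol is available.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show node "${node}" | grep -i 'features' || true
fi
```

Submit the script with sbatch. If no node matches the constraint expression, the job cannot be scheduled, so check the feature names against the table of node features first.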
All Node Features
You can run the nodeInfo command from the ufrc environment module to list all available SLURM features. In addition, the table below shows automatically updated nodeInfo output along with the corresponding CPU models.
Partition | Cores per node | Sockets | Socket Cores | Threads/Core | Memory (GB) | Features | CPU Model
hpg-dev | 64 | 8 | 8 | 1 | 500 | hpg3;amd;milan;infiniband;el8 | AMD EPYC 75F3 32-Core Processor
gui | 32 | 2 | 16 | 1 | 124 | gui;i21;intel;haswell;el8 | Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
hwgui | 32 | 2 | 16 | 1 | 186 | hpg2;intel;skylake;infiniband;gpu;rtx6000;el8 | Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz
bigmem | 128 | 8 | 16 | 1 | 4023 | bigmem;amd;rome;infiniband;el8 | AMD EPYC 7702 64-Core Processor
bigmem | 192 | 4 | 24 | 2 | 1509 | bigmem;intel;skylake;infiniband;el8 | Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz
hpg-milan | 64 | 8 | 8 | 1 | 500 | hpg3;amd;milan;infiniband;el8 | AMD EPYC 75F3 32-Core Processor
hpg-default | 128 | 8 | 16 | 1 | 1003 | hpg3;amd;rome;infiniband;el8 | AMD EPYC 7702 64-Core Processor
hpg2-compute | 32 | 2 | 16 | 1 | 124 | hpg2;intel;haswell;infiniband;el8 | Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
hpg2-compute | 28 | 2 | 14 | 1 | 125 | hpg2;intel;haswell;infiniband;el8 | Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
gpu | 32 | 2 | 16 | 1 | 186 | hpg2;intel;skylake;infiniband;gpu;2080ti;el8 | Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
gpu | 128 | 8 | 16 | 1 | 2010 | ai;su3;amd;rome;infiniband;gpu;a100;el8 | AMD EPYC 7742 64-Core Processor
hpg-ai | 128 | 8 | 16 | 1 | 2010 | ai;su3;amd;rome;infiniband;gpu;a100;el8 | AMD EPYC 7742 64-Core Processor
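Each Features entry in the table is a semicolon-separated list of feature names, and a --constraint expression is matched against those names. As a rough illustration, the sketch below checks a copied feature string for individual names before a job is submitted. The has_feature function is a hypothetical helper written for this example, not a SLURM or UFRC tool, and it only tests single names; the scheduler itself evaluates the & and | operators.

```shell
#!/bin/bash
# has_feature FEATURES NAME
# Succeeds if NAME is exactly one of the semicolon-separated entries
# in FEATURES (the format used in the Features column).
has_feature() {
    local features="$1" wanted="$2" f
    local IFS=';'            # split the feature string on semicolons
    for f in $features; do
        [ "$f" = "$wanted" ] && return 0
    done
    return 1
}

# Example: the hpg-milan row from the table.
row='hpg3;amd;milan;infiniband;el8'
if has_feature "$row" amd && has_feature "$row" infiniband; then
    echo "row satisfies amd&infiniband"
fi
```

The comparison is on whole names, so a partial string such as mil does not match milan.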
Note: See GPU_Access for more details on GPUs, such as available GPU memory. The following CPU models are listed from oldest to newest: interlagos, magny, sandy-bridge, dhabi, haswell, broadwell, skylake. The 'dhabi' and 'haswell' models are from the HPG1 and HPG2 deployments.
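To see the current features directly on the cluster, load the ufrc module and run nodeInfo, as described above. If nodeInfo is not available, the standard sinfo command reports similar information. The sketch below wraps sinfo in a hypothetical helper named list_features; the column selection is an assumption about what is useful here, not a reproduction of nodeInfo's output.

```shell
#!/bin/bash
# list_features: print partition, CPUs per node, memory per node (MB),
# and the feature list for each node group known to Slurm.
list_features() {
    if command -v sinfo >/dev/null 2>&1; then
        # %P = partition, %c = CPUs per node, %m = memory (MB), %f = features
        sinfo --format="%P %c %m %f"
    else
        echo "sinfo not found: run this on a cluster login node" >&2
        return 1
    fi
}
```

Calling list_features on a login node prints one line per combination of partition and node configuration, which you can compare against the table when choosing a constraint.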