Large-Memory SMP Servers

From UFRC
Revision as of 16:15, 6 November 2011 by Taylor

The HPC Center within Research Computing currently maintains the following resources for calculations requiring large amounts of physical memory.

  1. 21 Intel (E5462, 8x 2.8 GHz cores) servers with 64 GB of RAM (physical memory) each.
  2. 1 Intel (X7560, 16x 2.2 GHz cores) server with 128 GB of RAM.
  3. 1 AMD (Opteron 6174, 48x 2.2 GHz cores) server with 512 GB of RAM.

The 128 GB and 512 GB machines are available only via a dedicated queue (bigmem). Before attempting to run jobs in the bigmem queue, request access to it through a support request (http://support.hpc.ufl.edu).

Note that memory is no less a consumable resource than processors. Therefore, all resource requests are considered on the basis of "processor equivalents" (PE) rather than processors alone. This mechanism normalizes large-memory jobs to an equivalent number of CPUs so that NCU allocations may be enforced.

PE = MAX(ProcsRequestedByJob / TotalConfiguredProcs, MemoryRequestedByJob / TotalConfiguredMemory) * TotalConfiguredProcs

The total number of configured processors and the total amount of configured memory vary slightly as machines are added to and removed from the batch system but, in general, each is fairly constant. This means that a job requesting only a single processor (core) but, say, 148 GB of RAM will be converted to an equivalent number of processors. If the PE for your job exceeds your group's NCU allocation, your job will not run.
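
As a rough illustration of the PE formula, the following Python sketch computes the PE for the single-core, 148 GB example. The function name and the cluster totals (1024 cores, 4096 GB) are hypothetical values chosen only to make the arithmetic easy to follow. They are not the actual configured totals of the batch system.

```python
def processor_equivalents(procs_requested, mem_requested_gb,
                          total_procs, total_mem_gb):
    """Return the PE of a job, following the formula given above.

    The job is charged for whichever share of the cluster is larger,
    its processor share or its memory share, expressed in processors.
    """
    proc_share = procs_requested / total_procs
    mem_share = mem_requested_gb / total_mem_gb
    return max(proc_share, mem_share) * total_procs


# Hypothetical cluster totals, assumed for illustration only.
TOTAL_PROCS = 1024
TOTAL_MEM_GB = 4096

# One core with 148 GB of RAM: the memory share dominates.
pe = processor_equivalents(1, 148, TOTAL_PROCS, TOTAL_MEM_GB)
print(pe)  # 37.0
```

Under these assumed totals, the job counts as 37 processors against the group's NCU allocation even though it requests a single core. With the real totals the number will differ, so use checkjob to see the value actually assigned.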

Once your job has been submitted, you can see the PE value assigned to it via "checkjob <job id>".