==Parallel Computing==
 
Parallel computing refers to running multiple computational tasks simultaneously. The idea behind it is that a big computational task can usually be divided into smaller tasks which can run concurrently.

=== Types of parallel computing ===
Parallel computing covers only the last row of the table below (the multiple-data classes):
  
 
{| class="wikitable"
|-
!  !! Single Instruction !! Multiple Instructions !! Single Program !! Multiple Programs
|-
! Single Data
| SISD || MISD || ||
|-
! Multiple Data
| SIMD || MIMD || SPMD || MPMD
|}
  
In more detail:
* Data parallel (SIMD): the same operation/instruction is carried out on different data items simultaneously.
* Task parallel (MIMD): different instructions operate on different data concurrently.
* SPMD: a single program runs on multiple data; execution is not synchronized at the level of individual operations.
  
SPMD and MIMD are essentially equivalent, because any MIMD program can be recast as SPMD. SIMD can also be expressed this way, though it is less practical. MPI (Message Passing Interface) is primarily used for SPMD/MIMD programming.
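As a minimal illustrative sketch (not part of the original page), the MPI program below shows the SPMD style: every process runs the same executable but branches on its rank, so different processes can do different work. The file name and the compile/run commands are just examples.

<pre>
/* spmd_hello.c - illustrative SPMD sketch using MPI.
   Compile: mpicc spmd_hello.c -o spmd_hello
   Run:     mpirun -np 4 ./spmd_hello                */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* every process runs this same program */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* ...but learns its own rank...        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* ...and the total number of ranks     */

    if (rank == 0) {
        /* rank 0 takes a different branch: MIMD-style behavior from one program */
        printf("Master of %d processes\n", size);
    } else {
        printf("Worker rank %d\n", rank);
    }

    MPI_Finalize();
    return 0;
}
</pre>

Because the branch on <code>rank</code> lets different processes follow different code paths, a single program can express task parallelism as well.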
== Shared Memory vs. Distributed Memory ==
  
=== Shared Memory ===
  
Shared memory is memory that all of the processors can access. From a hardware point of view, it means all processors have direct access to a common physical memory, usually over a bus. The processors can work independently while they all access the same memory, and any change to a variable stored in that memory is visible to every processor: they all address the same logical memory locations, regardless of where the physical memory actually resides.
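For illustration, this is the model that threading approaches such as OpenMP express: all threads read and write the same arrays and variables directly. The sketch below is a hedged example; the file name and compiler flags are assumptions, not from the original page.

<pre>
/* shared_sum.c - illustrative shared-memory sketch using OpenMP.
   Compile: gcc -fopenmp shared_sum.c -o shared_sum   */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];
    double sum = 0.0;

    /* All threads see the same array 'a' in the shared address space. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 0.5 * i;

    /* The reduction clause avoids a race on the shared variable 'sum'. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f (up to %d threads available)\n", sum, omp_get_max_threads());
    return 0;
}
</pre>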
Uniform Memory Access (UMA):
* Most commonly represented today by Symmetric Multiprocessor (SMP) machines
* Identical processors
* Equal access and access times to memory
* Sometimes called CC-UMA - Cache Coherent UMA. Cache coherent means if one processor updates a location in shared memory, all the other processors know about the update. Cache coherency is accomplished at the hardware level.
  
Non-Uniform Memory Access (NUMA):
* Often made by physically linking two or more SMPs
* One SMP can directly access memory of another SMP
* Not all processors have equal access time to all memories
* Memory access across link is slower
* If cache coherency is maintained, then may also be called CC-NUMA - Cache Coherent NUMA
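As a hedged illustration of why placement matters on NUMA machines, the sketch below uses the Linux libnuma library (assumed to be installed; it is not mentioned in the original page) to allocate a buffer on a specific NUMA node, so CPUs on that node access it locally instead of across the slower link.

<pre>
/* numa_alloc.c - illustrative NUMA-aware allocation sketch using libnuma.
   Compile: gcc numa_alloc.c -o numa_alloc -lnuma        */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    size_t bytes = 64 * 1024 * 1024;     /* 64 MB buffer (example size)        */
    int node = numa_max_node();          /* pick the highest-numbered node     */

    /* Allocate memory physically located on that node; access from CPUs on
       other nodes would cross the (slower) interconnect.                     */
    double *buf = numa_alloc_onnode(bytes, node);
    if (buf == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    printf("Allocated %zu bytes on NUMA node %d\n", bytes, node);

    numa_free(buf, bytes);
    return 0;
}
</pre>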
=== Distributed Memory ===
In the hardware sense, distributed memory refers to the case where processors can access each other's memory only through a network. In the software sense, it means each processor can directly see only its local memory and must communicate over the network to access the memory of the other processors.
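As an illustrative sketch (assuming an MPI implementation is available; the file name and commands are examples), the program below shows the distributed-memory model: a value held in one process's local memory must be sent as a message before another process can use it.

<pre>
/* dist_sendrecv.c - illustrative distributed-memory sketch: data moves only via messages.
   Compile: mpicc dist_sendrecv.c -o dist_sendrecv
   Run:     mpirun -np 2 ./dist_sendrecv                                       */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 3.14;   /* lives only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* rank 1 cannot read rank 0's memory directly; it must receive a copy */
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %f over the network\n", value);
    }

    MPI_Finalize();
    return 0;
}
</pre>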
  
=== Hybrid ===
A combination of the two kinds of memory is what is usually used in today's fast supercomputers. A hybrid memory system is essentially a network of shared-memory units: within each unit the memory is accessible to all of its CPUs, and in addition, the units can access data held on the other units through the network.
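A common way to program such hybrid machines, sketched below purely as an illustration, is to combine MPI between nodes with OpenMP threads within each node (for example, one MPI rank per node). The rank count, thread counts, and compile command are assumptions, not part of the original page.

<pre>
/* hybrid.c - illustrative hybrid MPI + OpenMP sketch: message passing between
   nodes, shared-memory threading within each node.
   Compile: mpicc -fopenmp hybrid.c -o hybrid
   Run:     mpirun -np 2 ./hybrid          (e.g. one rank per node)            */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* Ask MPI for thread support suitable for OpenMP regions. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Threads within one rank share that node's memory... */
    #pragma omp parallel
    {
        printf("Rank %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    /* ...while data between ranks is combined via messages (here, a reduce). */
    int local = omp_get_max_threads(), total = 0;
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Total threads across all ranks: %d\n", total);

    MPI_Finalize();
    return 0;
}
</pre>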