Getting Started

Getting an account

To get an account at the UF HPC Center, submit a request through our request page.

Logging In

To log in to the cluster, you need an SSH client. If you are using a Linux- or Unix-based system, one is most likely already available from your shell, so you can reach your account very quickly.

If you are having problems connecting to your account, please let the HPC Staff know by submitting a Bugzilla Request.

Linux / Unix

Here is how you would log in from a Linux or Unix system:

test@puppy:~$ ssh test@submit.hpc.ufl.edu
test@submit.hpc.ufl.edu's password:
Last login: Fri Feb  9 00:03:38 2007 from wsip-70-168-187-166.ga.at.cox.net
[test@submit ~]$

The command ssh test@submit.hpc.ufl.edu is what you type at a command prompt on your system, replacing test with your own username. You are then prompted for your password. Once you enter it, you are logged in and ready to work.

Windows

For Microsoft Windows, things are a bit trickier. Windows does not come with a built-in SSH client, so you need to download a client from the Internet, install it, and use that instead. We recommend the following:

  • For a shell client, PuTTY
  • For a file transfer client, WinSCP

Both clients are documented on their websites, so we do not cover them here. Once you are logged in and have a prompt that resembles the one in the Linux / Unix section above, you can continue with this tutorial.

Passwords

The passwords assigned to users at the HPC Center expire every six months. If you receive a message stating that your password is about to expire, you can reset it by using the following command:

$ passwd
Enter new UNIX password:
Retype new UNIX password:

As you can see above, the system will ask for the password twice as it wants to confirm that you have typed it in correctly. Nothing will be displayed when you are typing the actual password.

Changing your Password

To change your password to something easier to remember than the one supplied, which typically looks like a line of static, do the following:

$ passwd
Changing password for user <YOUR USERNAME>.
Enter login(LDAP) password: 
New UNIX password: 
Retype new UNIX password: 
New password: 
Re-enter new password: 
LDAP password information changed for <YOUR USERNAME>
passwd: all authentication tokens updated successfully.

The above asks for your old password, then asks for your new password a total of four times. This seems like a lot, but it is actually a nice thing because it does the following:

  • Makes sure that you are entering your new password correctly.
  • Helps train your fingers for muscle memory.

Looking Around

We expect users of the HPC Center cluster to already have a working knowledge of the Linux operating system, so we do not go into detail here on using it. Below are some links to webpages that describe a lot of this, however:

File System

We have a structured file system that is important to know about. Please read about it here: HPC File System

Editing

Editing files on the cluster can be done through a couple of different methods...

In-System Editors

  • vi - vi is the basic editor for a number of people. Using the editor is not necessarily intuitive, so we have provided a link to a tutorial.
  • emacs - emacs is a much heavier-duty editor, but it also has commands that are non-intuitive. Again, we have provided a link to a tutorial for this editor.

External Editors

You can also use your favorite editor on your local machine and then transfer the files to the HPC cluster afterwards. One caveat: text files created on Windows machines often contain extra characters (a carriage return at the end of each line), which can cause major problems when the files are interpreted. If this happens, you can use a utility called dos2unix to remove the extra characters.
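As a rough illustration of the clean-up described above, the following sketch builds a small file with Windows line endings, converts it, and confirms that no carriage returns remain. The file name crlf_example.sh is a hypothetical placeholder, and the tr fallback is only an assumption for machines where dos2unix is not installed.

```shell
# Create a sample file with Windows (CRLF) line endings for illustration.
# The file name is a hypothetical placeholder.
printf 'date\r\nhostname\r\n' > crlf_example.sh

# Convert in place with dos2unix if it is installed; otherwise fall back
# to stripping carriage returns with tr.
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix crlf_example.sh
else
    tr -d '\r' < crlf_example.sh > crlf_example.tmp && mv crlf_example.tmp crlf_example.sh
fi

# Count the carriage returns left in the file (0 means it is clean).
cr_count=$(tr -cd '\r' < crlf_example.sh | wc -c | tr -d ' ')
echo "carriage returns remaining: $cr_count"
```

On your own files, running dos2unix followed by the file name is usually all that is needed. The counting step is only there to show that the conversion worked.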

Running Jobs

General Scheduling

Jobs from faculty investors in the HPC Center are now favored over jobs from groups who did not invest in the HPC Center.

Job scheduling has been a big topic with the HPC committee in the last several months. The HPC Center staff has been directed by the committee to improve the quality of service of job scheduling for jobs coming from investors in the HPC Center. This means reducing the time spent in the queues and allowing jobs from the investors to capture the full share of the resources that they have paid for. The HPC committee recently adopted a document which spells out what they want.

Jobs can be submitted on submit.hpc.ufl.edu.

Torque Scheduler

The Torque Resource Manager has been installed on the HPC cluster, and the cluster is gradually being switched over to it from the PBS Pro scheduler. The Maui scheduler is also running in this environment.

The Torque scheduler is installed on iogw2.hpc.ufl.edu, and accepts the same commands as the PBS Pro scheduler with a couple of exceptions. Currently we are experimenting with these packages so that we can provide improved scheduling to HPC users. While we are still learning about Torque and Maui, our experiences so far have been good and we are guardedly optimistic that Torque and Maui will end up being the resource manager and scheduler for the HPC Center sometime in the not-too-distant future.

Please note the following.

  • If your job is single-threaded (1 CPU) and does not have heavy I/O requirements, it does not need to run on an InfiniBand-enabled node. In that case, you should include the "gige" property in your PBS resource specification as follows:
#PBS -l nodes=1:ppn=1:gige
  • If you need to run an MPI-based application that has not been rebuilt for OpenMPI 1.2.0+Torque, please send us a note and we'll be happy to rebuild what you need - first come, first served.
  • If you build your own MPI-based application executables, you should use the MPI compiler wrappers (mpif90, mpicc, mpiCC) in /opt/intel/ompi/1.2.0/bin. These wrappers will automatically pull in the correct libraries.
  • We will continue to tune the Maui scheduler to provide fair and efficient scheduling according to the policies established by the HPC Committee and within the capabilities of the Maui scheduler. Keep in mind that these policies include priority and quality-of-service commitments to those faculty who have invested in the resources within the HPC Center.
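The notes above on the "gige" property and the MPI compiler wrappers might be put together as in the following sketch. It is only an illustration: the job name, the file names (serial_gige.pbs, hello_mpi.c) and the ten-minute walltime are hypothetical placeholders, and the build step is skipped unless both the wrapper and the source file are actually present.

```shell
# Write a single-threaded job script that requests a gige node.
# The job name, file names, and walltime are hypothetical placeholders.
cat > serial_gige.pbs <<'EOF'
#!/bin/sh
#PBS -N serial_gige
#PBS -o serial_gige.out
#PBS -e serial_gige.err
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=1:gige

# PBS_O_WORKDIR is set by the batch system to the submission directory.
cd "$PBS_O_WORKDIR" || exit 1
hostname
EOF

# Syntax-check the script locally without executing it.
if sh -n serial_gige.pbs; then script_ok=yes; else script_ok=no; fi
echo "script syntax ok: $script_ok"

# Build an MPI program with the wrapper from the directory named above.
# hello_mpi.c is a hypothetical source file; the build is skipped if the
# wrapper or the source file is missing.
OMPI_BIN=/opt/intel/ompi/1.2.0/bin
if [ -x "$OMPI_BIN/mpicc" ] && [ -f hello_mpi.c ]; then
    "$OMPI_BIN/mpicc" -o hello_mpi hello_mpi.c
else
    echo "skipping MPI build: $OMPI_BIN/mpicc or hello_mpi.c not present"
fi
```

Calling the wrapper by its full path avoids accidentally picking up a different MPI installation that happens to be earlier in your PATH.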

Altix

The Altix is known to the Torque batch system, but not to PBS Pro. To submit jobs to the Altix via the batch system, you will have to log into the host iogw2.hpc.ufl.edu and submit your jobs from there.

Trivial Example

#! /bin/sh
#PBS -N testjob
#PBS -o testjob.out
#PBS -e testjob.err
#PBS -M mxcheng@ufl.edu
#PBS -l walltime=00:01:00
#PBS -l nodes=1:ppn=1

date
hostname

To submit this job from submit.hpc.ufl.edu, you would use the following command:

$ qsub <your job script>

To check the status of running jobs, you would use the following command:

$ qstat [-u <username>]

Alternatively, follow the menu path HPC --> Utilization --> Torque Queue Status.
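The submit-and-check steps above can be combined into one short session, sketched below. The file name testjob.pbs is a hypothetical placeholder, the mail line from the trivial example is left out, and the qsub and qstat calls are guarded so that the sketch does nothing harmful on a machine without the batch system installed.

```shell
# Recreate the trivial job script (the -M mail line is omitted here).
# The file name testjob.pbs is a hypothetical placeholder.
cat > testjob.pbs <<'EOF'
#!/bin/sh
#PBS -N testjob
#PBS -o testjob.out
#PBS -e testjob.err
#PBS -l walltime=00:01:00
#PBS -l nodes=1:ppn=1

date
hostname
EOF

if command -v qsub >/dev/null 2>&1; then
    # qsub prints the new job's identifier on standard output.
    job_id=$(qsub testjob.pbs)
    echo "submitted job: $job_id"
    # List the jobs belonging to the current user.
    qstat -u "${USER:-$(id -un)}"
else
    job_id=""
    echo "qsub not found; run this on submit.hpc.ufl.edu"
fi
```

Capturing the job identifier in a variable makes it easy to refer to the job later. When the job finishes, its output should appear in the files named by the -o and -e lines.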