Difference between revisions of "SAS"

From UFRC
Line 1: Line 1:
 +
__NOTOC__
 +
__NOEDITSECTION__
 
[[Category:Software]][[Category:Statistics]]
 
[[Category:Software]][[Category:Statistics]]
==Location==
+
<!-- ########  Template Configuration ######## -->
SAS is installed in /apps/sas
+
{|
 
+
<!--Main settings - REQUIRED-->
There are currently three different versions installed:
+
|{{#vardefine:app|sas}}
 
+
|{{#vardefine:url|http://www.sas.com/}}
* SAS 9.2
+
<!--Compiler and MPI settings - OPTIONAL -->
* SAS 9.3
+
|{{#vardefine:intel|}} <!-- E.g. "11.1" -->
 
+
|{{#vardefine:mpi|}} <!-- E.g. "openmpi/1.3.4" -->
==Commands==
+
<!--Choose sections to enable - OPTIONAL-->
 +
|{{#vardefine:mod|1}} <!--Present instructions for running the software with modules -->
 +
|{{#vardefine:exe|1}} <!--Present manual instructions for running the software -->
 +
|{{#vardefine:conf|}} <!--Enable config wiki page link - {{#vardefine:conf|1}} = ON/conf|}} = OFF-->
 +
|{{#vardefine:pbs|}} <!--Enable PBS script wiki page link-->
 +
|{{#vardefine:policy|}} <!--Enable policy section -->
 +
|{{#vardefine:testing|}} <!--Enable performance testing/profiling section -->
 +
|{{#vardefine:faq|}} <!--Enable FAQ section -->
 +
|{{#vardefine:citation|}} <!--Enable Reference/Citation section -->
 +
|}
 +
<!-- ########  Template Body ######## -->
 +
<!--Description-->
 +
{{#if: {{#var: url}}|
 +
{{App_Description|app={{#var:app}}|url={{#var:url}}}}|}}
 +
SAS is a commercial integrated system for statistical analysis, data mining, and graphics, with many additional enterprise-oriented features. The cost of SAS and the breadth of its features mean that mastering it requires both a significant monetary investment and a substantial time investment. The [http://support.sas.com/documentation/onlinedoc/stat/index.html#stat93 SAS 9.3 Documentation] is vast. For research purposes, the [http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#titlepage.htm SAS 9.3 User's Guide] is a thorough reference for the functions and procedures you may need for statistical analysis with SAS.
 +
<!--Location-->
 +
{{App_Location|app={{#var:app}}|{{#var:ver}}}}
 +
==Available versions==
 +
* 9.2.
 +
* 9.3 (default, supported).
 +
<!-- -->
 +
{{#if: {{#var: mod}}|==Running the application using modules==
 +
{{App_Module|app={{#var:app}}|intel={{#var:intel}}|mpi={{#var:mpi}}}}|}}
 +
{{#if: {{#var: exe}}|==How To Run==
 +
===Commands===
 
SAS has a number of options:
 
SAS has a number of options:
  
* -work /location : This is the location of your work area. It is needed if it is not defined in your SAS initialization files.
 
 
* -nodms : This stops SAS from using its GUI capability and goes into a text only mode.
 
* -nodms : This stops SAS from using its GUI capability and goes into a text only mode.
 
* -filelocks none : This causes SAS to not use filelocking.
 
* -filelocks none : This causes SAS to not use filelocking.
 +
* -nonews: Prevents SAS from printing a useless header to the output.
 +
* -memsize xxxxM - [http://support.sas.com/documentation/cdl/en/hostunx/63053/HTML/default/viewer.htm#n09y5anvvpzrmnn0ztkyf59qgzvr.htm specifies the total amount of memory that is available to each SAS session, and places an enforced limit on the amount of virtual memory that SAS can dynamically allocate at any one time].
 +
* -realmemsize xxxxM  - sets the [http://support.sas.com/documentation/cdl/en/hostunx/63053/HTML/default/viewer.htm#p00ta2t9eibgofn19cmdlpdfgjnw.htm recommended upper limit on working memory for procedures that can use both memory and utility disk space, such as PROC SUMMARY and PROC SORT, so that they can avoid virtual memory thrashing].
 +
* -work $TMPDIR - the directory where SAS should store its temporary files. Using $TMPDIR will allow your program to run much faster and prevent any network-related file access issues that SAS is prone to run into.
 
* -sysin file : Designate a file for SAS to load as its input.
 
* -sysin file : Designate a file for SAS to load as its input.
==Batch Submission==
+
===Batch Submission===
To do a batch submission to SAS, you would use the '''-sysin''' command line option and write all of your SAS commands in a file. In this way you can submit jobs to the batch queue system to run your jobs on the rest of the cluster.
+
To do a batch submission of a SAS script, write all of your SAS commands in a file and pass that file with the '''-sysin''' command line option. You can then submit the job to the batch queue system to run it on the cluster.
 +
===-work directory===
 +
SAS is a mature product with a long history behind it. In a modern high-performance environment this means that additional actions need to be taken to mitigate potential issues stemming from SAS's focus on filesystem I/O instead of memory.
  
 +
===Interactive Use===
 +
To connect to a test node (for example, test01) where you can run the graphical interface to SAS, use the following or a similar command:
 +
ssh USER@test01.ufhpc -o ForwardX11=yes -o ForwardX11Trusted=yes -o ProxyCommand='ssh USER@submit.hpc.ufl.edu exec nc test01 %p'
 +
or add the following to your ~/.ssh/config file:
 +
Host test01
 +
    User USER
 +
    KeepAlive yes
 +
    ProxyCommand ssh USER@submit.hpc.ufl.edu exec nc test01 %p
 +
    ForwardX11 yes
 +
    ForwardX11Trusted yes
 +
where USER is your username. After editing ~/.ssh/config you can just run the following command to connect:
 +
ssh test01
 +
 +
Once connected run
 +
module load sas
 +
sas
 +
|}}
 +
{{#if: {{#var: conf}}|==Configuration==
 +
See the [[{{PAGENAME}}_Configuration]] page for {{#var: app}} configuration details.|}}
 +
==PBS Script Examples==
 
===Sample Job Script Ver 9.2===
 
===Sample Job Script Ver 9.2===
 
<pre>
 
<pre>
 
#!/bin/bash
 
#!/bin/bash
 
#
 
#
#PBS -r n
+
#PBS -N Research
#PBS -N Market Research
 
 
#PBS -o market.out
 
#PBS -o market.out
 +
#PBS -e market.err
 
#PBS -m abe
 
#PBS -m abe
 
#PBS -M <EMAIL ADDRESS>
 
#PBS -M <EMAIL ADDRESS>
#PBS -e market.err
 
 
#PBS -l nodes=1:ppn=1
 
#PBS -l nodes=1:ppn=1
#PBS -l pmem=600mb
+
#PBS -l pmem=1gb
#PBS -l walltime=00:15:00
+
#PBS -l walltime=01:00:00
  
EXE=/apps/sas/9.2/SASFoundation/9.2/sas
+
module load sas/9.2
 
cd /scratch/ufhpc/
 
cd /scratch/ufhpc/
  
$EXE -memsize 1024M -nodms -work `pwd` -filelocks none -sysin sas.inp
+
sas -memsize 1024M -nonews -nodms -work $TMPDIR -filelocks none -sysin sas.inp
 
</pre>
 
</pre>
  
Line 42: Line 92:
 
#!/bin/bash
 
#!/bin/bash
 
#
 
#
#PBS -r n
+
#PBS -N Research
#PBS -N Market Research
 
 
#PBS -o market.out
 
#PBS -o market.out
 +
#PBS -e market.err
 
#PBS -m abe
 
#PBS -m abe
 
#PBS -M <EMAIL ADDRESS>
 
#PBS -M <EMAIL ADDRESS>
#PBS -e market.err
 
 
#PBS -l nodes=1:ppn=1
 
#PBS -l nodes=1:ppn=1
#PBS -l pmem=600mb
+
#PBS -l pmem=1gb
#PBS -l walltime=00:15:00
+
#PBS -l walltime=01:00:00
 
 
EXE=/apps/sas/9.3/SASFoundation/9.3/sas
 
cd /scratch/ufhpc/
 
 
 
$EXE -memsize 1024M -nodms -work `pwd` -filelocks none -sysin sas.inp
 
</pre>
 
 
 
==Possible Errors==
 
* If the following error appears:
 
<pre>
 
ERROR: A lock is not available for #C00002.CORE.CATALOG, lock held by another
 
      process.
 
FATAL: Unable to initialize the options subsystem.
 
ERROR:  (SASXKINI): PHASE 3 KERNEL INITIALIZATION FAILED.
 
UNABLE TO INITIALIZE THE SAS KERNEL
 
</pre>
 
You will need to use the '''-filelocks none''' option as described above when starting SAS.
 
<pre>
 
ERROR: Library WORK does not exist.
 
FATAL: Unable to initialize the options subsystem.
 
ERROR:  (SASXKINI): PHASE 3 KERNEL INITIALIZATION FAILED.
 
UNABLE TO INITIALIZE THE SAS KERNEL
 
</pre>
 
You need to designate where your work area is with the '''-work LOCATION''' directive.
 
<pre>
 
X11 connection rejected because of wrong authentication.
 
Fatal I/O error
 
The network connection has been lost.
 
 
 
Traceback
 
Traceback
 
  
 +
module load sas
 +
cd /scratch/hpc/USERNAME
  
No Traceback Available
+
sas -memsize 1024M -nodms -nonews -work $TMPDIR -filelocks none -sysin sas.inp
No Traceback Available
 
 
</pre>
 
</pre>
You apparently do not have X11 access to the node, and you will have to use it through a console mode using the '''-nodms''' directive.
+
{{#if: {{#var: policy}}|==Usage policy==
* If this is the case and you want X11 access to the node, connecting to the node with '''ssh -X''' should fix the problem for you. Remember that this has to pass through every node that you are connected through in the chain, so if you connect to the submission node first, that node must be connected with the -X option, then the node that you connect to do the actual interactive work must also be connected to in this fashion.
+
WRITE USAGE POLICY HERE (perhaps templates for a couple of main licensing schemes can be used)|}}
 +
{{#if: {{#var: testing}}|==Performance==
 +
WRITE PERFORMANCE TESTING RESULTS HERE|}}
 +
{{#if: {{#var: faq}}|==FAQ==
 +
*'''Q:''' **'''A:'''|}}

Revision as of 15:05, 5 April 2012

Description

SAS is a commercial integrated system for statistical analysis, data mining, and graphics, with many additional enterprise-oriented features. The cost of SAS and the breadth of its features mean that mastering it requires both a significant monetary investment and a substantial time investment. The SAS 9.3 documentation is vast. For research purposes, the SAS 9.3 User's Guide is a thorough reference for the functions and procedures you may need for statistical analysis with SAS.

Available versions

  • 9.2.
  • 9.3 (default, supported).

Running the application using modules

To use sas with the environment modules system at HPC the following commands are available:

Get module information for sas:

$ module spider sas

Load the default application module:

$ module load sas

The modulefile for this software adds the directory with executable files to the shell execution PATH and sets the following environment variables:

  • HPC_SAS_DIR - directory where sas is located.
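Because the modulefile sets HPC_SAS_DIR, a script can use that variable to check whether the module appears to be loaded before it calls sas. The following is a minimal sketch; the function name sas_module_status is hypothetical and is not provided by the module system.

```shell
#!/bin/bash
# Hypothetical helper (not part of the module system): report whether the
# sas module appears to be loaded, judging only by the HPC_SAS_DIR variable
# that the modulefile is documented to set.
sas_module_status() {
    if [ -n "${HPC_SAS_DIR:-}" ]; then
        echo "loaded: ${HPC_SAS_DIR}"
    else
        echo "not loaded"
    fi
}

sas_module_status
```

A job script could call this helper right after "module load sas" and stop early if it reports "not loaded".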

How To Run

Commands

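SAS has a number of options:

  • -nodms: stops SAS from using its GUI capability and runs it in text-only mode.
  • -filelocks none: causes SAS not to use file locking.
  • -nonews: prevents SAS from printing the news header to the output.
  • -memsize xxxxM: specifies the total amount of memory that is available to each SAS session, and places an enforced limit on the amount of virtual memory that SAS can dynamically allocate at any one time.
  • -realmemsize xxxxM: sets the recommended upper limit on working memory for procedures that can use both memory and utility disk space, such as PROC SUMMARY and PROC SORT, so that they can avoid virtual memory thrashing.
  • -work $TMPDIR: the directory where SAS should store its temporary files. Using $TMPDIR will allow your program to run much faster and prevent the network-related file access issues that SAS is prone to run into.
  • -sysin file: designates a file for SAS to load as its input.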

Batch Submission

To do a batch submission of a SAS script, write all of your SAS commands in a file and pass that file with the -sysin command line option. You can then submit the job to the batch queue system to run it on the cluster.
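As an illustration of the -sysin workflow, the sketch below writes a trivial SAS program to a file named sas.inp (the input file name used in the sample job scripts on this page) and prints the command that would run it. The helper write_demo_input, the temporary directory, and the program contents are assumptions made for this example only; the command is printed rather than executed so the sketch also works where SAS is not installed.

```shell
#!/bin/bash
# write_demo_input is a hypothetical helper: it writes a trivial SAS
# program (one DATA step that prints a message to the SAS log) to the
# path given as its first argument.
write_demo_input() {
    cat > "$1" <<'EOF'
data _null_;
    put 'Hello from a batch SAS job';
run;
EOF
}

# Use a scratch directory for the demonstration so nothing is overwritten.
demo_dir=$(mktemp -d)
write_demo_input "$demo_dir/sas.inp"

# Print the command a job script would run, rather than running it here.
echo "sas -nodms -nonews -sysin $demo_dir/sas.inp"
```

In a real job script the printed command would be run directly after loading the sas module.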

-work directory

SAS is a mature product with a long history behind it. In a modern high-performance environment this means that additional actions need to be taken to mitigate potential issues stemming from SAS's focus on filesystem I/O instead of memory.
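The sample job scripts on this page pass -work $TMPDIR so that SAS keeps its temporary files in the job's temporary directory. The sketch below shows one way a script might choose the value for -work defensively. The function name pick_sas_work and the fallback to /tmp are assumptions of this example, not site policy.

```shell
#!/bin/bash
# pick_sas_work (hypothetical helper): print the directory to pass to -work.
# Prefers $TMPDIR when it is set and points at a writable directory;
# otherwise falls back to /tmp, which is an assumption of this sketch.
pick_sas_work() {
    candidate="${TMPDIR:-/tmp}"
    if [ -d "$candidate" ] && [ -w "$candidate" ]; then
        echo "$candidate"
    else
        echo "/tmp"
    fi
}

# Print the resulting command line instead of running SAS.
echo "sas -nodms -nonews -work $(pick_sas_work) -filelocks none -sysin sas.inp"
```

Checking the directory up front avoids starting SAS with a work location that does not exist.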

Interactive Use

To connect to a test node (for example, test01) where you can run the graphical interface to SAS, use the following or a similar command:

ssh USER@test01.ufhpc -o ForwardX11=yes -o ForwardX11Trusted=yes -o ProxyCommand='ssh USER@submit.hpc.ufl.edu exec nc test01 %p'

or add the following to your ~/.ssh/config file:

Host test01
   User USER
   KeepAlive yes
   ProxyCommand ssh USER@submit.hpc.ufl.edu exec nc test01 %p
   ForwardX11 yes
   ForwardX11Trusted yes

where USER is your username. After editing ~/.ssh/config you can just run the following command to connect:

ssh test01

Once connected, run:

module load sas
sas

PBS Script Examples

Sample Job Script Ver 9.2

#!/bin/bash
#
#PBS -N Research
#PBS -o market.out
#PBS -e market.err
#PBS -m abe
#PBS -M <EMAIL ADDRESS>
#PBS -l nodes=1:ppn=1
#PBS -l pmem=1gb
#PBS -l walltime=01:00:00

module load sas/9.2
cd /scratch/ufhpc/

sas -memsize 1024M -nonews -nodms -work $TMPDIR -filelocks none -sysin sas.inp

Sample Job Script Ver 9.3

#!/bin/bash
#
#PBS -N Research
#PBS -o market.out
#PBS -e market.err
#PBS -m abe
#PBS -M <EMAIL ADDRESS>
#PBS -l nodes=1:ppn=1
#PBS -l pmem=1gb
#PBS -l walltime=01:00:00

module load sas
cd /scratch/hpc/USERNAME

sas -memsize 1024M -nodms -nonews -work $TMPDIR -filelocks none -sysin sas.inp