Difference between revisions of "PASA"

From UFRC
 
(9 intermediate revisions by 2 users not shown)
Line 1: Line 1:
  __NOTOC__
  __NOEDITSECTION__
- [[Category:Software]][[Category:Bioinformatics]][[Category:Genomics]]
+ [[Category:Software]][[Category:Biology]][[Category:Genomics]][[Category:NGS]]
  {|<!--Main settings - REQUIRED-->
  |{{#vardefine:app|pasa}}
Line 20: Line 20:
  The Program to Assemble Spliced Alignments (PASA) is used to automatically incorporate ESTs and full-length cDNAs into gene structure annotations, in the process annotating UTRs, alternative splicing variations, and polyadenylation sites.
  <!--Modules-->
- ==Required Modules==
+ ==Environment Modules==
- [[Modules|modules documentation]]
+ Run <code>module spider {{#var:app}}</code> to find out what environment modules are available for this application.
- ===Serial===
+ See [[Modules|modules documentation]] for details.
- *{{#var:app}}
- or
- * gcc/5.2.0
- * {{#var:app}}
- See 'module spider pasa' and check the version you want to run with module spider pasa/$version to know what modules to load.
  ==System Variables==
- * HPC_{{#uppercase:{{#var:app}}}}_DIR - installation directory
+ * HPC_{{uc:{{#var:app}}}}_DIR - installation directory
  * HPC_PASA_BIN - executable directory.
  * HPC_PASA_CONF - config file directory.
Line 36: Line 31:
  ===Tools and variables===
  * <code>UNIVECDB</code> environment variable is set to the location of the UniVec database for use by seqclean.
+
+ ===Database Removal Policy===
+ See [[Database_Removal_Procedure|the Database Removal Procedure]] for details on how application databases are removed or retained by UFRC.
+
  ===Databases===
- ;Note: we have a single DB server that supports all applications that make use of databases. Use [[SLURM Job Arrays]] and throttle you jobs to at most 10 active tasks at a time.
+ ;Note that we have a single DB server that supports all applications that make use of databases. Use [[SLURM Job Arrays]] and throttle you jobs to at most 10 active tasks at a time to avoid overwhelming it.
- * The MySQL server used by PASA has been configured to allow the pasaadmin user to create databases with names in the form of "pasa_${USER}_". So, to run pasa your configuration file such as alignAssembly.config or conf.txt must have MySQL configuration similar to the following entry:
+ The MySQL server used by PASA has been configured to allow the pasaadmin user to create databases with names in the form of "pasa_${USER}_". So, to run pasa your configuration file such as alignAssembly.config or conf.txt must have MySQL configuration similar to the following entry:
    MYSQLDB=pasa_jdoe_run1
Line 47: Line 46:
  Please contact us if you would like to prevent automatic removal of old pasa databases. We may remove all databases over six months old.
+
+ Use helper tools we created for managing PASA databases. The following tools are available:
+
+ * rc_list_pasa_databases
+ : List all databases named pasa_% on the server
+ * rc_create_pasa_db NAME
+ : Create an empty database with name 'NAME'. Name your databases as shown above
+ * rc_export_pasa_db NAME
+ : Save your PASA database named 'NAME' to a file named 'NAME.sql' in the current working directory. Use this to backup databases, so they don't get lost in a cleanup if you may need them later.
+ * rc_import_pasa_db NAME.sql
+ : Create a database called 'NAME' and import structure and data from an sql file called 'NAME.sql'
+ * rc_remove_pasa_db
+ : Remove a database you no longer need from the server.
+
  |}}
  ===Web Portal===

Latest revision as of 21:04, 4 January 2023

Description

pasa website  

The Program to Assemble Spliced Alignments (PASA) is used to automatically incorporate ESTs and full-length cDNAs into gene structure annotations, in the process annotating UTRs, alternative splicing variations, and polyadenylation sites.

Environment Modules

Run module spider pasa to find out what environment modules are available for this application. See modules documentation for details.

System Variables

  • HPC_PASA_DIR - installation directory
  • HPC_PASA_BIN - executable directory.
  • HPC_PASA_CONF - config file directory.
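
As a quick check, the variables above can be inspected from the shell once a PASA module has been loaded. The sketch below assumes the module was loaded beforehand (for example with module load pasa, after picking a version from module spider pasa). The function name report_pasa_env is hypothetical and is not part of the UFRC installation.

```shell
# Print each documented PASA variable, or "(not set)" when the module
# has not been loaded in the current shell.
# NOTE: report_pasa_env is a hypothetical helper written for this example.
report_pasa_env() {
    for name in HPC_PASA_DIR HPC_PASA_BIN HPC_PASA_CONF; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s=(not set)\n' "$name"
        fi
    done
}

report_pasa_env
```

If all three lines show "(not set)", the module is most likely not loaded in the current session.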

Additional Information

Tools and variables

  • UNIVECDB environment variable is set to the location of the UniVec database for use by seqclean.

Database Removal Policy

See the Database Removal Procedure for details on how application databases are removed or retained by UFRC.

Databases

Note that a single DB server supports all applications that use databases. Use SLURM Job Arrays and throttle your jobs to at most 10 active tasks at a time to avoid overwhelming it.
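
The throttling described above can be requested with the % separator in the SLURM --array option. The job script below is a minimal sketch: the job name, the array range of 50 tasks, and the per-task configuration file name are assumptions chosen for illustration, and the actual PASA command is left out.

```shell
#!/bin/bash
# Hypothetical job name for illustration.
#SBATCH --job-name=pasa_array
# 50 array tasks, with at most 10 running at the same time (the "%10" part).
#SBATCH --array=1-50%10

# SLURM sets SLURM_ARRAY_TASK_ID for each array task. Default to 1 so the
# script can also be dry-run outside of SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical per-task configuration file name.
config_file="alignAssembly.${task_id}.config"

echo "Task ${task_id} would use ${config_file}"
```

Only the number after % controls how many tasks touch the database server concurrently, so the array range itself can be as large as the workload requires.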

The MySQL server used by PASA has been configured to allow the pasaadmin user to create databases with names of the form "pasa_${USER}_". To run PASA, your configuration file (such as alignAssembly.config or conf.txt) must therefore contain a MySQL entry similar to the following:

MYSQLDB=pasa_jdoe_run1

Example configuration files are located in the $HPC_PASA_CONF directory.
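
One way to set the database entry in a copied configuration file is sketched below. It assumes the file already contains a line starting with MYSQLDB=, as in the entry shown above. The helper name set_mysqldb is hypothetical, and the file names in the usage comment are placeholders.

```shell
# Replace the MYSQLDB entry of a PASA configuration file in place.
# NOTE: set_mysqldb is a hypothetical helper written for this example.
# Usage: set_mysqldb CONFIG_FILE DB_NAME
set_mysqldb() {
    config_file="$1"
    db_name="$2"
    tmp_file="$(mktemp)" || return 1
    # Rewrite only the line that starts with MYSQLDB=, keep everything else.
    sed "s/^MYSQLDB=.*/MYSQLDB=${db_name}/" "$config_file" > "$tmp_file" \
        && cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Example (placeholder file and database names):
# set_mysqldb alignAssembly.config "pasa_${USER}_run1"
```

The rewrite goes through a temporary file and is copied back with cat so that the permissions of the original configuration file are preserved.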

Name your PASA database 'pasa_${USER}_$NAME', where '$USER' is your username and '$NAME' is the name you want to give the database. For example, a database can be named 'pasa_jdoe_experiment_1'.
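
The naming rule can be applied consistently with a small shell function. This is a sketch only: pasa_db_name is a hypothetical helper, and the restriction to letters, digits, and underscores is an assumption made here to keep names simple, not a documented server requirement.

```shell
# Build a database name of the form pasa_${USER}_NAME.
# NOTE: pasa_db_name is a hypothetical helper written for this example.
pasa_db_name() {
    run_name="$1"
    # Fall back to `id -un` if USER is not set in the environment.
    user_name="${USER:-$(id -un)}"
    case "$run_name" in
        ''|*[!A-Za-z0-9_]*)
            # Assumption: only letters, digits, and underscores are accepted.
            echo "invalid name: use letters, digits, and underscores" >&2
            return 1
            ;;
    esac
    printf 'pasa_%s_%s\n' "$user_name" "$run_name"
}

# For user jdoe this prints pasa_jdoe_experiment_1
pasa_db_name experiment_1
```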

Please contact us if you would like to prevent automatic removal of old pasa databases. We may remove all databases over six months old.

Use the helper tools we created to manage PASA databases. The following tools are available:

  • rc_list_pasa_databases
    List all databases named pasa_% on the server.
  • rc_create_pasa_db NAME
    Create an empty database with the name 'NAME'. Name your databases as shown above.
  • rc_export_pasa_db NAME
    Save your PASA database named 'NAME' to a file named 'NAME.sql' in the current working directory. Use this to back up databases you may need later, so they are not lost in a cleanup.
  • rc_import_pasa_db NAME.sql
    Create a database called 'NAME' and import its structure and data from an SQL file called 'NAME.sql'.
  • rc_remove_pasa_db
    Remove a database you no longer need from the server.
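
A backup step built on the export tool listed above might look like the following sketch. The wrapper name backup_pasa_db is hypothetical. It only calls rc_export_pasa_db with the documented NAME argument, and when the tool is not on the PATH (for example outside the cluster) it prints the command it would have run instead.

```shell
# Export a PASA database to NAME.sql in the current working directory.
# NOTE: backup_pasa_db is a hypothetical wrapper written for this example.
backup_pasa_db() {
    db_name="$1"
    case "$db_name" in
        pasa_*) ;;
        *)
            echo "refusing: '$db_name' does not start with pasa_" >&2
            return 1
            ;;
    esac
    if command -v rc_export_pasa_db >/dev/null 2>&1; then
        rc_export_pasa_db "$db_name"
    else
        # Dry run when the helper tool is not available in this shell.
        echo "rc_export_pasa_db not found; would run: rc_export_pasa_db $db_name"
    fi
}

# Example (placeholder database name):
# backup_pasa_db pasa_jdoe_run1
```

Keeping the exported NAME.sql file means the database can later be restored with rc_import_pasa_db NAME.sql.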

Web Portal

You can visualize the results of your PASA run at "http://pasa2.rc.ufl.edu/cgi-bin/status_report.cgi?db=$MYSQL_DB", where "$MYSQL_DB" must be replaced with the database name used for the run. Use your HPC credentials to access the protected web portal.
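
The substitution can be done in the shell rather than by hand. In this sketch the function name pasa_status_url is hypothetical, and pasa_jdoe_run1 is the example database name used earlier on this page.

```shell
# Print the status report URL for a given PASA database name.
# NOTE: pasa_status_url is a hypothetical helper written for this example.
pasa_status_url() {
    printf 'http://pasa2.rc.ufl.edu/cgi-bin/status_report.cgi?db=%s\n' "$1"
}

# Prints:
# http://pasa2.rc.ufl.edu/cgi-bin/status_report.cgi?db=pasa_jdoe_run1
pasa_status_url pasa_jdoe_run1
```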

Note: The PASA2 web interface no longer has an administrative page. Go directly to status_report.cgi.

Pasa 1.5 URIs can be reached at http://pasa.rc.ufl.edu.