Difference between revisions of "Trinity"

From UFRC

Revision as of 10:19, 16 November 2012

Description

trinity website  

Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works like so:

Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.

Chrysalis groups the Inchworm contigs into clusters and constructs a complete de Bruijn graph for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or set of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.

Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.

Required Modules

modules documentation

Serial

  • trinity

System Variables

  • HPC_TRINITY_DIR - installation directory
  • ALLPATHSLG_BASEDIR - Allpaths-LG installation directory

Additional Information

To run Trinity after loading the module, use the "Trinity.pl" Perl script or the run_Trinity.sh shell script, which is particularly useful because the optional ALLPATHS-LG software is enabled at UF HPC.

If the run fails with an error stating that Java could not create a virtual machine due to insufficient heap memory, you can set the Java heap size with an

export _JAVA_OPTIONS="-Xmx2g"

command, either at the command line if doing an interactive run on a test node or in the job script. Make sure that the "-Xmx" value is less than the amount of memory you requested from the batch system.

The default Butterfly memory setting in the Trinity.pl controller script is '-Xmx20G', so plan your job resource request accordingly.
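
The "-Xmx" advice above can be sketched as a small shell helper that derives the heap limit from the memory requested for the job, so that the two values stay consistent. This is only an illustration: the function name java_heap_opts, the REQUESTED_MEM_GB variable, and the 2 GB default headroom are assumptions made for this example, not part of Trinity or of the batch system.

```shell
# Hypothetical helper (not part of Trinity): print a "-Xmx" option that
# stays below the memory requested from the batch system.
#   $1: memory requested for the job, in whole GB
#   $2: headroom to leave outside the Java heap, in whole GB
#       (assumed default for this sketch: 2)
java_heap_opts() {
    if [ "$1" -le "${2:-2}" ]; then
        echo "requested memory must exceed the headroom" >&2
        return 1
    fi
    echo "-Xmx$(( $1 - ${2:-2} ))g"
}

# REQUESTED_MEM_GB is an assumed variable name; set it by hand to match
# the memory request in your job script.
REQUESTED_MEM_GB=${REQUESTED_MEM_GB:-24}

# Export the derived option so that any Java process started later in
# this shell or job script picks it up.
_JAVA_OPTIONS=$(java_heap_opts "$REQUESTED_MEM_GB") && export _JAVA_OPTIONS
echo "_JAVA_OPTIONS=$_JAVA_OPTIONS"
```

With a 24 GB request and the assumed 2 GB headroom, the helper yields "-Xmx22g", which leaves room for the Butterfly default of '-Xmx20G' mentioned above.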