Difference between revisions of "BEDTools"

From UFRC

Revision as of 02:02, 10 August 2012

Description

bedtools website  

The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by "streaming" several BEDTools utilities together. The following are examples of common questions that one can address with BEDTools.

  1. Intersecting two BED files in search of overlapping features.
  2. Culling/refining/computing coverage for BAM alignments based on genome features.
  3. Merging overlapping features.
  4. Screening for paired-end (PE) overlaps between PE sequences and existing genomic features.
  5. Calculating the depth and breadth of sequence coverage across defined "windows" in a genome.
  6. Screening for overlaps between "split" alignments and genomic features.
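As a rough illustration of questions 1 and 3 above, the following sketch builds two tiny, made-up BED files and runs `bedtools intersect` and `bedtools merge` on them. The file names and coordinates are invented for this example, and the bedtools calls are skipped if the command is not on the PATH (for instance, when the module has not been loaded).

```shell
#!/bin/sh
# Hypothetical example data: tab-separated chrom, start, end, name.
printf 'chr1\t100\t200\tfeatA1\nchr1\t300\t400\tfeatA2\n' > a.bed
printf 'chr1\t150\t250\tfeatB1\nchr1\t500\t600\tfeatB2\n' > b.bed

# merge expects input sorted by chromosome, then by start position.
sort -k1,1 -k2,2n a.bed b.bed > ab.sorted.bed

if command -v bedtools >/dev/null 2>&1; then
    # Question 1: report the portions of a.bed features that overlap b.bed.
    bedtools intersect -a a.bed -b b.bed
    # Question 3: collapse overlapping features into single intervals.
    bedtools merge -i ab.sorted.bed
else
    echo "bedtools not found on PATH; load the bedtools module first" >&2
fi
```

Sorting with the standard `sort` utility is only one option here; `bedtools sort` could be used instead.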

Because all of the BEDTools utilities accept input from standard input (stdin), one can stream (pipe) several commands together to facilitate more complicated analyses. The tools also allow fine control over how output is reported. Support was most recently added for sequence alignments in BAM ([1]) format, for features in VCF and GFF, and for "blocked" BED format. The tools are quite fast and typically finish in a few seconds, even for large datasets. Upstream documentation for bedtools.
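A minimal sketch of such a pipeline, assuming bedtools is on the PATH: each stage reads the previous stage's output by naming `stdin` as its input file. The input files and their contents are hypothetical and exist only to make the example self-contained.

```shell
#!/bin/sh
# Hypothetical example data: an unsorted feature file and one target interval.
printf 'chr1\t300\t400\nchr1\t100\t200\nchr1\t150\t250\n' > unsorted.bed
printf 'chr1\t120\t180\n' > targets.bed

if command -v bedtools >/dev/null 2>&1; then
    # sort -> merge -> intersect, with no intermediate files written to disk.
    bedtools sort -i unsorted.bed \
        | bedtools merge -i stdin \
        | bedtools intersect -a stdin -b targets.bed
else
    echo "bedtools not found on PATH; load the bedtools module first" >&2
fi
```

Piping avoids intermediate files, which can matter when the inputs are large.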

Available versions

  • 2.16.2

Execution Environment and Modules

To use bedtools with the environment modules system at HPC, the following commands are available:

Get module information for bedtools:

$ module spider bedtools

Load the default application module:

$ module load bedtools

The modulefile for this software adds the directory with executable files to the shell execution PATH and sets the following environment variables:

  • HPC_BEDTOOLS_DIR - the directory where bedtools is located
  • HPC_BEDTOOLS_BIN - the bin directory
  • HPC_BEDTOOLS_DATA - the BEDTools data directory
  • HPC_BEDTOOLS_GENOMES - the BEDTools genomes directory
  • HPC_BEDTOOLS_SCRIPTS - the BEDTools scripts directory
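One way to check what the modulefile set in the current shell is sketched below. It tries to load the module if a `module` command is available, then prints each of the variables listed above, marking any that are unset. The output file name `bedtools_env.txt` is an arbitrary choice for this example.

```shell
#!/bin/sh
# Load the module when the environment modules system is present in this shell.
if command -v module >/dev/null 2>&1; then
    module load bedtools || echo "could not load the bedtools module" >&2
fi

# Record each variable named in the list above, or a placeholder if it is unset.
for var in HPC_BEDTOOLS_DIR HPC_BEDTOOLS_BIN HPC_BEDTOOLS_DATA \
           HPC_BEDTOOLS_GENOMES HPC_BEDTOOLS_SCRIPTS; do
    eval "value=\${$var:-}"
    echo "$var=${value:-<not set>}"
done > bedtools_env.txt

cat bedtools_env.txt
```

If the variables are set, they can be used in job scripts in place of hard-coded paths, for example by listing the contents of "$HPC_BEDTOOLS_GENOMES".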