RepeatAnalysisTools

From UFRC

Description

RepeatAnalysisTools website  

This repository contains instructions for processing and repeat analysis of sequence data generated with the PacBio No-Amp Targeted Sequencing Protocol with simplified double Cas9 cut.

UPDATE: The RepeatAnalysisTools scripts in this repository now use Python 3.

Outputs from the analysis scripts include high-accuracy (>=QV20) CCS sequences for the target regions, so users can analyze the results with other third-party tools as necessary.

Environment Modules

Run module spider RepeatAnalysisTools to find out what environment modules are available for this application.

System Variables

  • HPC_RATOOLS_DIR - installation directory
  • HPC_RATOOLS_BIN - executable directory

Additional Information

To use any of the Bash or Python scripts provided with the tools (i.e., the files ending in ".sh" or ".py"), prefix the script name with ${HPC_RATOOLS_DIR}/

For example, a preprocess.sh command might look like:

${HPC_RATOOLS_DIR}/preprocess.sh \
   m64012_191221_044659.subreads.bam \
   m64012_191221_044659.adapters.fasta \
   /data/reference/genomes/human/hs37d5/hs37d5.fa \
   ./output \
   16 \
   16 \
   local
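
The prefixing rule above can be wrapped in a small shell helper so that a missing HPC_RATOOLS_DIR is reported clearly instead of producing a confusing "command not found" error. This is only a sketch: the function name ratools_script is hypothetical and is not provided by RepeatAnalysisTools or by the environment module.

```shell
#!/usr/bin/env bash

# ratools_script is a hypothetical convenience helper (not part of the tools).
# It prints the full path of a bundled script, e.g. preprocess.sh.
ratools_script() {
    local name="$1"

    # HPC_RATOOLS_DIR is set by the environment module; stop early if it is missing.
    if [ -z "${HPC_RATOOLS_DIR:-}" ]; then
        echo "HPC_RATOOLS_DIR is not set; load the RepeatAnalysisTools module first" >&2
        return 1
    fi

    # Strip one trailing slash, if present, so the result never contains "//".
    printf '%s/%s\n' "${HPC_RATOOLS_DIR%/}" "$name"
}
```

Once the module is loaded, "$(ratools_script preprocess.sh)" expands to the same path as ${HPC_RATOOLS_DIR}/preprocess.sh and can be followed by the usual arguments.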



Citation

If you publish research that uses RepeatAnalysisTools, cite it as follows:


@software{tange_2021_5013933,
     author       = {Tange, Ole},
     title        = {GNU Parallel 20210622 ('Protasevich')},
     month        = Jun,
     year         = 2021,
     note         = {{GNU Parallel is a general parallelizer to run
                      multiple serial command line programs in parallel
                      without changing them.}},
     publisher    = {Zenodo},
     doi          = {10.5281/zenodo.5013933},
     url          = {https://doi.org/10.5281/zenodo.5013933}