Search results

Category:Alignment
Biological sequence alignment tools and documentation

24 members (0 subcategories, 0 files) - 15:24, 23 August 2022
WebLogo
WebLogo is an application designed to make the generation of sequence logos as easy and painless as possible. ...precise description of, for example, a binding site, than would a consensus sequence.

3 KB (417 words) - 17:40, 22 August 2022
Category:Assembly
Short Sequence Assembly software.

20 members (0 subcategories, 0 files) - 15:24, 23 August 2022
Infernal
Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a

3 KB (305 words) - 19:31, 24 August 2022
EvoLSTM
...robabilities. EvoLSTM brings modern machine-learning approaches to bear on sequence evolution. It will serve as a useful tool to study and simulate complex mut

3 KB (309 words) - 18:52, 12 August 2022
Sim4
...verlaps the end of the other). If seqfile2 is a database of sequences, the sequence in seqfile1 will be aligned with each of the sequences in seqfile2.

2 KB (302 words) - 20:41, 12 August 2022
RNAsalsa
...e, that RNAsalsa uses structure information for adjusting and refining the sequence alignment and vice versa.

3 KB (368 words) - 21:24, 6 December 2019
Spruceup
...ts (alignment rows), which is different from the problem of poorly aligned sequence blocks (alignment columns) commonly addressed by alignment trimming softwar ...identification, visualization, and removal of outliers from large multiple sequence alignments. Journal of Open Source Software, 4(42), 1635, https://doi.org/1

3 KB (317 words) - 21:24, 6 December 2019
CD-HIT
...-hit produces a set of closely related protein families from a given fasta sequence database.

3 KB (354 words) - 18:24, 12 August 2022
SuperCRUNCH
...orking with phylogenetic datasets. SuperCRUNCH can be run using any set of sequence data, as long as sequences are in fasta format with standard naming convent ...s (adjust sequence directions, adjust reading frames), several options for sequence alignment (Clustal-O, MAFFT, Muscle, MACSE), and multiple options for align

4 KB (499 words) - 14:47, 11 May 2020
RepeatMasker
sequence as well as a modified version of the query sequence in which On average, almost 50% of a human genomic DNA sequence currently will

2 KB (325 words) - 20:32, 12 August 2022
Minigraph
...graph constructor. It finds approximate locations of a query sequence in a sequence graph and incrementally augments an existing graph with long query subseque

2 KB (279 words) - 18:08, 31 March 2021
Sputnik
...the resulting hits are written to stdout along with their position in the sequence, length, and a score determined by the length of the repeat and the number

2 KB (311 words) - 15:50, 22 August 2022
FAST
...BioPerl FastA format). FAST tools expose the power of Perl and BioPerl for sequence analysis to non-programmers in an easy-to-learn command-line paradigm.

3 KB (358 words) - 21:21, 6 December 2019
Artemis
...next generation data and the results of analyses within the context of the sequence, and also its six-frame translation. ...: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data.

3 KB (378 words) - 14:53, 12 August 2022
Mlstcheck
...for each locus, and the ST (or nearest ST), the other contains the genomic sequence for each allele. ...us, the contaminated flag is set. Optionally you can output a concatenated sequence in FASTA format, which you can then use with tree building programs. New, u

3 KB (395 words) - 18:06, 27 May 2022
TNTBLAST
...ideally) match experimental PCR results. To enable searching of very large sequence databases (i.e. all of GenBank), ThermonucleotideBLAST can use run-time dat

3 KB (385 words) - 20:08, 2 June 2022
Seq crumbs
seq_crumbs aims to be a collection of small sequence processing utilities. ...and most of them take a sequence file as input and create a new processed sequence file as output. This design encourages the assembly of the seq_crumbs utili

3 KB (328 words) - 15:12, 27 May 2022
HMMER
HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods

3 KB (397 words) - 18:18, 15 August 2022
Impg
impg (Implicit Pangenome Graph) projects sequence ranges through many-way (e.g. all-vs-all) pairwise alignments built by tool ...htforward to use to extract FASTA sequences for downstream use in multiple sequence alignment (like mafft) or pangenome graph building (e.g., pggb or minigraph

3 KB (392 words) - 21:22, 10 April 2024
Reference Indexes
...rence indexes built for the respective tools using the appropriate genomic sequence data. This page provides a short overview of our reference index building p ...most reference indexes to be built. In that case we'll use the non-masked sequence.

2 KB (361 words) - 20:30, 12 August 2022
Consed
calling, sequence comparisons, and sequence assembly. Phred, Cross_match, Consed/Autofinish is a tool for viewing, editing, and finishing sequence assemblies created with phrap. Finishing capabilities include allowing the

3 KB (348 words) - 18:39, 12 August 2022
UCSC
...ion tasks such as reverse complementation, codon and amino acid lookup and sequence translation, as well as functions specifically designed for extracting, loa

3 KB (349 words) - 20:53, 12 August 2022
WGS Assembler
...n sequence of a multi-cellular organism (Myers 2000) and the first diploid sequence of an individual human (Levy 2007). Celera Assembler was developed at Celer

3 KB (359 words) - 20:56, 12 August 2022
SRAssembler
...cal assembly of genomic regions matching a homologous query protein or DNA sequence. ...ssembler first collects the reads that can be locally aligned to the query sequence and assembles them into contigs. Additional reads are then found by alignin

3 KB (381 words) - 15:33, 27 May 2022
Kalign
Kalign is a fast multiple sequence alignment program for biological sequences. "Kalign 3: multiple sequence alignment of large data sets."

2 KB (265 words) - 19:28, 12 August 2022
USEARCH
...tions, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output fil ...RCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude

4 KB (559 words) - 17:19, 22 August 2022
PRINSEQ
...n in Perl and can be helpful if you want to filter, reformat, or trim your sequence data. It also generates basic statistics for your sequences.

2 KB (271 words) - 15:52, 10 June 2022
Wgs
...n sequence of a multi-cellular organism (Myers 2000) and the first diploid sequence of an individual human (Levy 2007). Celera Assembler was developed at Celer

3 KB (375 words) - 20:55, 12 August 2022
Nt
This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. It powers several papers and work

2 KB (262 words) - 14:38, 15 July 2022
MHAP
...rithm. Designed to efficiently detect all overlaps between noisy long-read sequence data. It efficiently estimates Jaccard similarity by compressing sequences

2 KB (270 words) - 16:15, 10 June 2022
SEQPower
SEQPower provides statistical power analysis and sample size estimation for sequence-based association studies. ...tez, B. Peng and S. M. Leal, Power analysis and sample size estimation for sequence-based association studies. Bioinformatics. (2014)]

2 KB (274 words) - 22:38, 21 August 2022
GHOST-MP
...ches for similar sequences among nucleotide query sequences and amino acid sequence database like BLASTX. GHOST-MP runs on a distributed memory system and proc

2 KB (285 words) - 20:11, 7 May 2020
Shasta
...of the Shasta long read assembler is to rapidly produce accurate assembled sequence using as input DNA reads generated by Oxford Nanopore flow cells. Using a run-length representation of the read sequence. This makes the assembly process more resilient to errors in homopolymer re

3 KB (402 words) - 21:24, 6 December 2019
Ont tutorial cas9
...t enrichment strategy. This workflow is suitable for Oxford Nanopore fastq sequence collections and requires a reference genome and a BED file of target coordi

2 KB (289 words) - 18:38, 2 June 2022
Tcoffee
T-Coffee is a multiple sequence alignment package. You can use T-Coffee to align sequences or to combine th T-Coffee: A novel method for multiple sequence alignments.

2 KB (280 words) - 20:46, 12 August 2022
NCBI GenBank Tools
..., 2013 Jan;41(D1):D36-42). GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), t

2 KB (291 words) - 14:33, 19 August 2022
ProDy
...s for comparative analysis and modeling of protein structural dynamics and sequence co-evolution. Fast and flexible ProDy API is for interactive usage as well

2 KB (294 words) - 19:38, 21 August 2022
Newbler
...de novo DNA sequence assembly. It is designed specifically for assembling sequence data generated by the 454 GS-series of pyrosequencing platforms sold by 454

2 KB (305 words) - 16:17, 19 August 2022
Rvtests
...tests, is a flexible software package for genetic association analysis for sequence datasets. ...ficient and Comprehensive Tool for Rare Variant Association Analysis Using Sequence Data. Bioinformatics 2016 32: 1423-1426.]

3 KB (293 words) - 21:59, 21 August 2022
GeneMarkS
....hmm (P and E) programs identify the maximum likely parse of the whole DNA sequence into protein coding genes (with possible introns) and intergenic regions.

2 KB (296 words) - 16:45, 27 May 2022
BloomTree
The Sequence Bloom Tree (SBT) is a method that will allow you to index a set of sequence. The code base provided here is an implementation of SBT written in

3 KB (302 words) - 16:49, 10 June 2022
MetaGeneMark
....hmm (P and E) programs identify the maximum likely parse of the whole DNA sequence into protein coding genes (with possible introns) and intergenic regions.

2 KB (298 words) - 21:22, 6 December 2019
GeneMarkS-T
....hmm (P and E) programs identify the maximum likely parse of the whole DNA sequence into protein coding genes (with possible introns) and intergenic regions.

2 KB (296 words) - 17:25, 2 June 2022
DEXTRACTOR
RS II sequencer. Generally speaking, this information is the sequence of all the reads to produce a highly accurate consensus sequence as the last step in the assembly

3 KB (441 words) - 19:20, 10 June 2022
HAPCUT
...ts, and outputs the phased haplotype blocks that can be assembled from the sequence reads.

2 KB (298 words) - 18:13, 15 August 2022
Ncbi cli
Use it to find and download sequence, annotation, and metadata for genes and genomes Use datasets to download biological sequence data across all domains of life from NCBI.

3 KB (306 words) - 15:47, 9 June 2023
Fairseq
Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom mod A Fast, Extensible Toolkit for Sequence Modeling. Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and

3 KB (296 words) - 15:18, 15 August 2022
YASRA
...improve the performance, both in runtime and quality for 454 and Illumina sequence reads.

3 KB (304 words) - 20:56, 12 August 2022
GeneMark-ES
....hmm (P and E) programs identify the maximum likely parse of the whole DNA sequence into protein coding genes (with possible introns) and intergenic regions. F

3 KB (303 words) - 16:44, 27 May 2022
MMseqs2
...ing) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for L ...Soeding J. MMseqs2 desktop and local web server app for fast, interactive sequence searches. Bioinformatics, doi: 10.1093/bioinformatics/bty1057 (2019).]

3 KB (413 words) - 20:22, 2 October 2023
Wise
...e focused on comparisons of biopolymers, commonly DNA sequence and protein sequence. There are many other packages which do this, probably the best known being

3 KB (321 words) - 21:29, 6 December 2019
SNP-Pipeline
SNP Pipeline is a pipeline for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced fr ...ne: an automated method for constructing SNP matrices from next-generation sequence data. PeerJ Computer Science 1:e20 https://doi.org/10.7717/peerj-cs.20]

3 KB (306 words) - 17:00, 10 June 2022
Catch
...ackage for designing probe sets to use for nucleic acid capture of diverse sequence. of species. It allows blacklisting sequence from the design (e.g., background in microbial enrichment),

3 KB (422 words) - 18:23, 12 August 2022
DeconSeq
The DeconSeq tool can be used to automatically detect and efficiently remove sequence contaminations from genomic and metagenomic datasets. It is easily configur ...med/21278185 Schmieder R and Edwards R: Fast identification and removal of sequence contamination from genomic and metagenomic datasets. PLoS ONE 2011, 6:e1728

3 KB (299 words) - 17:00, 10 June 2022
Recycler
...from sequence data of isolate microbial genomes, plasmidome and metagenome sequence data.

3 KB (297 words) - 14:21, 11 October 2022
FSA
...using only pairwise estimations of homology. This is made possible by the sequence annealing technique for constructing a multiple alignment from pairwise com

3 KB (307 words) - 20:38, 23 May 2022
Ancestryhmm
...neously estimating local ancestry and admixture time using next generation sequence data in samples of arbitrary ploidy. ...neously estimating local ancestry and admixture time using next generation sequence data in samples of arbitrary ploidy. PLoS genetics, 13(1), p.e1006529.

2 KB (291 words) - 21:25, 9 May 2023
AliStat
Multiple sequence alignments may contain a variety of completely-specified characters. Here a for multiple sequence alignments. NAR Genomics and Bioinformatics 2 (2), lqaa024.

3 KB (315 words) - 20:18, 1 October 2020
REPdenovo
REPdenovo is designed for constructing repeats directly from sequence reads. It is based on the idea of frequent k-mer assembly. REPdenovo provides ...elsen and Yufeng Wu, REPdenovo: Inferring de novo repeat motifs from short sequence reads, PLoS One 11.3 (2016): e0150719.]

3 KB (327 words) - 21:43, 21 August 2022
SeSiMCMC
The Sequence Similarities by Markov Chain Monte Carlo (SeSiMCMC) algorithm careful Bayesian analysis to consider site absence in a sequence.

3 KB (299 words) - 22:41, 21 August 2022
HLAminer
...and align the resulting contigs to reference HLA alleles from the IMGT/HLA sequence repository using commodity hardware with standard specifications (<2GB RAM, ...novo assembly of all recruited reads, a set of contigs is generated. Only sequence contigs equal or larger than 200nt in length are considered for further ana

4 KB (541 words) - 14:55, 14 December 2022
Magus
MAGUS is a tool for piecewise large-scale multiple sequence alignment. Original MAGUS paper: Smirnov, V. and Warnow, T., 2020. MAGUS: Multiple Sequence Alignment using Graph Clustering. Bioinformatics. https://doi.org/10.1093/b

3 KB (327 words) - 17:51, 16 March 2022
CRISP
...d using the Illumina sequencing platform. In principle, it should work for sequence data from other sequencing platforms. The method requires each pool to be s

3 KB (342 words) - 21:01, 6 December 2019
MetaCluster
...nomic sequences. Existing binning methods based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorg

3 KB (321 words) - 19:26, 18 August 2022
RSeQC
...put sequence data especially RNA-seq data. “Basic modules” quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while “RNA-se

3 KB (328 words) - 17:48, 10 June 2022
Barrnap
...more CPUs. Running time is approximately 1 second per 1 megabase of input sequence.

3 KB (328 words) - 17:55, 10 June 2022
ProtExcluder
...ude the portion of DNA sequence (as well as certain length of the flanking sequence – given by the user, default = 50 bp) matching subjects in a protein data

3 KB (329 words) - 21:22, 6 December 2019
Bridger
...r to search an optimal set of paths (transcripts) that can be supported by sequence data and could explain all observed splicing events of each locus.

3 KB (341 words) - 13:14, 15 August 2022
Last
* Handle big sequence data, e.g.: * Use sequence quality data properly.

3 KB (342 words) - 19:29, 12 August 2022
Probalign
...terior probability estimates to compute maximum expected accuracy multiple sequence alignments. It performs statistically significantly better than the leading generation and manipulation of multiple sequence alignments using

3 KB (349 words) - 19:36, 21 August 2022
CGView
...w sequence features, gene and protein names, COG category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, i

3 KB (351 words) - 19:39, 23 May 2022
PhyloPhlAn2
...nction that selects how many positions to consider for each of the multiple-sequence alignments.

3 KB (375 words) - 17:57, 9 June 2022
Jasper
...assemblies. JASPER is substantially faster than polishing methods based on sequence alignment, and more accurate than currently available k-mer based methods.

3 KB (354 words) - 16:42, 14 December 2023
EMBOSS
...BOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole. EMBOSS breaks the historical trend towards

3 KB (372 words) - 15:01, 15 August 2022
SeqPrep
...most cases) so they are forcefully merged. When reads do not have adapter sequence they must be treated with care when doing the merging, so a much more sensi

3 KB (371 words) - 22:38, 21 August 2022
HybPiper
HybPiper was designed for targeted sequence capture, in which DNA sequencing libraries are enriched for gene regions of ...., Zerega, N. J. C, and Wickett, N. J. (2016). HybPiper: Extracting Coding Sequence and Introns for Phylogenetics from High-Throughput Sequencing Reads Using T

3 KB (349 words) - 18:36, 10 June 2022
Maq
Follow these steps to run Maq. All you need is a reference sequence file in the FASTA format. Prepare a reference sequence (ref.fasta), preferably a bacterial genome, to make the test run faster.

3 KB (377 words) - 19:47, 12 August 2022
Viralmsa
ViralMSA is a tool to perform reference-guided multiple sequence alignment of viral genomes. ViralMSA wraps around existing read mapping too ...Moshiri N (2020). "ViralMSA: Massively scalable reference-guided multiple sequence alignment of viral genomes." Bioinformatics. btaa743. doi:10.1093/bioinform

3 KB (357 words) - 19:56, 27 May 2022
MODELLER
...alignment of protein sequences and/or structures, clustering, searching of sequence databases, comparison of protein structures, etc. MODELLER is available for

3 KB (364 words) - 19:53, 12 August 2022
Trim Galore
...h some added functionality to remove biased methylation positions for RRBS sequence files (for directional, non-directional (or paired-end) sequencing). It's m ...suitable for both ends of paired-end libraries), but accepts other adapter sequence, too

4 KB (512 words) - 20:52, 12 August 2022
RDnaTools
...is a python package of tools and pipelines for working with ribosomal DNA sequence data generated with the PacBio(R) SMRT sequencing. rDnaTools works by wrapp ...al Ecology community, and their existing tools for analyzing ribosomal DNA sequence data. Since the core of the analyses wrapped by rDnaTools come from the Mot

3 KB (361 words) - 21:24, 6 December 2019
Ssr pipeline
...quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user ...n of microsatellite sequences from paired-end Illumina High-Throughput DNA sequence data (ver. 1.1, February 2014): U.S. Geological Survey Data Series 778.

4 KB (501 words) - 18:55, 6 June 2022
Quake
..., which create false sequence dissimilar to anything in the original genome sequence from which the read was taken.

3 KB (367 words) - 18:50, 10 June 2022
Mvftools
...e data is encoded based on the information content at a particular aligned sequence site. This contextual encoding allows for rapid computation of phylogenetic

3 KB (370 words) - 13:58, 30 April 2020
Erpin
...write complex descriptors before starting a search. Instead ERPIN reads a sequence alignment and secondary structure, and automatically infers a statistical ...ert A. (2001) Direct RNA Motif Definition and Identification from Multiple Sequence Alignments using Secondary Structure Profiles. J Mol Biol. 313:1003-11]

3 KB (377 words) - 21:21, 6 December 2019
Samtools
...nce Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. SAM Tools provide various utilities for manipulating alignments

3 KB (357 words) - 22:04, 21 August 2022
BiG-SCAPE
...ed on a comparison of their protein domain content, order, copy number and sequence identity.

3 KB (377 words) - 12:53, 15 August 2022
PROVEAN
PROVEAN is useful for filtering sequence variants to identify nonsynonymous or indel variants that are predicted to A fast computation approach to obtain pairwise sequence alignment scores enabled the generation of precomputed PROVEAN predictions

3 KB (350 words) - 20:27, 12 August 2022
MACH
available genotype and shotgun sequence data to estimate unobserved Li Y, Willer CJ, Ding J, Scheet P and Abecasis GR (2010) MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Ep

3 KB (363 words) - 20:23, 15 August 2022
Anchorwave
...ragments, i.e., anchor and inter-anchor intervals. By performing sensitive sequence alignment for each shorter interval via a 2-piece affine gap cost strategy ...Michelle C. Stitzer. AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication.

3 KB (367 words) - 21:10, 20 September 2023
FastANI
...avoids expensive sequence alignments and uses Mashmap as its MinHash based sequence mapping engine to compute the orthologous mappings and alignment identity e

3 KB (380 words) - 21:21, 6 December 2019
ShapeMapper
...grates careful handling of all classes of adduct-induced sequence changes, sequence variant correction, basecall quality filters, and quality-control warnings

3 KB (387 words) - 21:24, 6 December 2019
INDELible
INDELible is a new, portable, and flexible application for biological sequence simulation that combines many features in the same place for the first time ...tcher, W. and Yang, Z. 2009. INDELible: a flexible simulator of biological sequence evolution. Mol. Biol. and Evol. 2009 26(8):1879-1888]

3 KB (384 words) - 13:34, 28 June 2021
MOSAIK
...ikSort resolves paired-end reads and sorts the alignments by the reference sequence coordinates. Finally, MosaikText converts alignments to different text-base

3 KB (414 words) - 11:59, 19 August 2022
Ncbi-vdb
...sion by Reference, which only stores the differences in base pairs between sequence data and the segment it aligns to. The process to restore original data, fo

3 KB (428 words) - 14:33, 19 August 2022
Scrappie
* scrappie squiggle Create approximate squiggle for sequence * scrappie seqmappy Map signal to sequence via basecall posteriors

3 KB (416 words) - 13:43, 7 April 2021
TWINSCAN
with the length of the target sequence. A rough guideline is 1 GB of memory for 1 Mb of input sequence.

3 KB (386 words) - 20:20, 27 May 2022
TRF
...Repeats with pattern size in the range from 1 to 2000 bases are detected. Sequence information sent to the server is confidential and deleted after program ex

3 KB (442 words) - 20:52, 12 August 2022
SCRATCH-1D
* EVALpro release 1.0 (2019) : Evaluation of sequence-based & profile-based predictors * PROFILpro release 2.0 (2021) : Protein evolutionary information / sequence profiles

3 KB (381 words) - 20:35, 12 August 2022
