
Searching for Repeats, As an Example of Using the Generalised Ruzzo-Tompa Algorithm to Find Optimal Subsequences with Gaps

Overview
Specialty: Biology
Date: 2014 Jul 4
PMID: 24989859
Citations: 1
Abstract

Some biological sequences contain subsequences of unusual composition; e.g. some proteins contain DNA binding domains, transmembrane regions and charged regions, and some DNA sequences contain repeats. The linear-time Ruzzo-Tompa (RT) algorithm finds subsequences of unusual composition: it takes a sequence of scores as input and returns the corresponding 'maximal segments' as output. In principle, permitting gaps in the output subsequences could improve sensitivity. Here, the input of the RT algorithm is generalised to a finite, totally ordered, weighted graph, so the algorithm locates paths of maximal weight through increasing but not necessarily adjacent vertices. Because it permits the penalised deletion of unfavourable letters, the generalisation includes gaps. The program RepWords, which finds inexact simple repeats in DNA, exemplifies the general concepts by out-performing a similar, extant ad hoc tool. With minimal programming effort, the generalised Ruzzo-Tompa algorithm could improve the performance of many programs for finding biological subsequences of unusual composition.
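
To make the input and output of the RT algorithm concrete, the following is a minimal Python sketch of the original, ungapped algorithm only: a sequence of scores goes in, and the maximal segments come out. It is not the generalised graph version described in the abstract and it is not RepWords. The function name maximal_segments and the (start, end, score) output format are assumptions made for illustration. For brevity the sketch searches the candidate list with a plain right-to-left scan, so it does not include the extra bookkeeping that the linear-time bound relies on.

```python
def maximal_segments(scores):
    """Return the maximal scoring segments of `scores` as (start, end, score).

    `start` is inclusive and `end` is exclusive. Hypothetical helper written
    for illustration; it covers the ungapped case only.
    """
    # Each candidate is [start, end, L, R], where L is the cumulative score
    # just before the segment and R is the cumulative score at its end.
    segments = []
    total = 0
    for i, s in enumerate(scores):
        left = total
        total += s
        if s <= 0:
            continue  # non-positive scores never start a segment
        cand = [i, i + 1, left, total]
        while True:
            # Find the rightmost earlier candidate j with L_j < L_cand.
            j = len(segments) - 1
            while j >= 0 and segments[j][2] >= cand[2]:
                j -= 1
            if j < 0 or segments[j][3] >= cand[3]:
                # No such j, or j already reaches at least as high:
                # keep cand as a separate candidate.
                segments.append(cand)
                break
            # Otherwise extend cand leftwards to absorb candidates j..end,
            # then reconsider the enlarged candidate.
            cand = [segments[j][0], cand[1], segments[j][2], cand[3]]
            del segments[j:]
    return [(start, end, r - l) for start, end, l, r in segments]


example = [4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]
print(maximal_segments(example))  # [(0, 1, 4), (2, 3, 3), (4, 11, 7)]
```

In the example, the last segment spans the scores 1, 2, -2, 2, -2, 1, 5: it absorbs the negative scores inside it because the flanking positive scores more than compensate. A gapped variant in the sense of the abstract would instead allow such unfavourable positions to be deleted at a penalty; that step is not shown here.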

Citing Articles

Most of the tight positional conservation of transcription factor binding sites near the transcription start site reflects their co-localization within regulatory modules.

Acevedo-Luna N, Marino-Ramirez L, Halbert A, Hansen U, Landsman D, Spouge J. BMC Bioinformatics. 2016; 17(1):479.

PMID: 27871221 PMC: 5117513. DOI: 10.1186/s12859-016-1354-5.
