
Identification of Consensus RNA Secondary Structures Using Suffix Arrays

Overview
Publisher Biomed Central
Specialty Biology
Date 2006 May 9
PMID 16677380
Citations 10
Abstract

Background: The identification of a consensus RNA motif often consists of finding a conserved secondary structure with minimum free energy in an ensemble of aligned sequences. However, an alignment is often difficult to obtain without prior structural information, hence the need for tools that automate this process.

Results: We present an algorithm called Seed to identify all the conserved RNA secondary structure motifs in a set of unaligned sequences. The search space is defined as the set of all the secondary structure motifs inducible from a seed sequence. A general-to-specific search finds all the motifs that are conserved. Suffix arrays are used both to enumerate efficiently all the biological palindromes and to match RNA secondary structure expressions. We assessed the ability of this approach to uncover known structures using four datasets. The enumeration of the motifs relies only on the secondary structure definition and on conservation, which allows scoring schemes to be evaluated independently. Twelve simple objective functions based on free energy were evaluated for their potential to discriminate native folds from the rest.
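As an illustration of how a suffix array can support the enumeration of biological palindromes (a segment followed downstream by its reverse complement, that is, a candidate stem), here is a minimal Python sketch. It is not the Seed implementation: the function names, the naive suffix array construction, the fixed stem length, the minimum loop size, and the restriction to Watson-Crick pairs (no G-U wobble) are all assumptions made for the example.

```python
# Hypothetical sketch, not the Seed algorithm itself.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}  # Watson-Crick only (assumption)


def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return sorted start positions of pattern, using binary search on sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def reverse_complement(segment):
    return "".join(COMPLEMENT[base] for base in reversed(segment))


def enumerate_stems(seq, stem_len=3, min_loop=3):
    """List (i, j) where seq[i:i+stem_len] can pair with seq[j:j+stem_len]."""
    sa = build_suffix_array(seq)
    stems = []
    for i in range(len(seq) - stem_len + 1):
        partner = reverse_complement(seq[i:i + stem_len])
        for j in find_occurrences(seq, sa, partner):
            # Keep only downstream partners separated by at least min_loop bases.
            if j >= i + stem_len + min_loop:
                stems.append((i, j))
    return stems


if __name__ == "__main__":
    print(enumerate_stems("GGGAAAACCC"))
```

The naive construction sorts whole suffixes and is only suitable for short inputs; a production tool would use a linear or near-linear construction, but the lookup logic would stay the same.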

Conclusion: Our evaluation shows that 1) support and exclusion constraints are sufficient to make an exhaustive search of the secondary structure space feasible; 2) the search space induced from a seed sequence contains known motifs; and 3) simple objective functions, consisting of a combination of the free energy of matching sequences, can generally identify motifs with high positive predictive value and sensitivity to known motifs.
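For readers unfamiliar with the two measures named above, the following Python sketch computes positive predictive value (the fraction of predicted items that are correct) and sensitivity (the fraction of known items that are recovered). Comparing sets of base pairs is an assumption made for illustration; the abstract does not state the level at which the authors scored predictions, and the function name is hypothetical.

```python
def ppv_and_sensitivity(predicted_pairs, known_pairs):
    """Compare two collections of (i, j) base pairs (assumed unit of comparison)."""
    predicted, known = set(predicted_pairs), set(known_pairs)
    true_positives = len(predicted & known)
    # PPV = TP / (TP + FP); sensitivity = TP / (TP + FN).
    ppv = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(known) if known else 0.0
    return ppv, sensitivity


if __name__ == "__main__":
    predicted = [(0, 9), (1, 8), (2, 6)]
    known = [(0, 9), (1, 8), (2, 7), (3, 6)]
    print(ppv_and_sensitivity(predicted, known))
```

In this made-up example, two of the three predicted pairs are correct and two of the four known pairs are recovered.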

Citing Articles

A novel method for the identification of conserved structural patterns in RNA: From small scale to high-throughput applications.

Pietrosanto M, Mattei E, Helmer-Citterich M, Ferre F. Nucleic Acids Res. 2016; 44(18):8600-8609.

PMID: 27580722 PMC: 5062999. DOI: 10.1093/nar/gkw750.


RiboFSM: frequent subgraph mining for the discovery of RNA structures and interactions.

Gawronski A, Turcotte M. BMC Bioinformatics. 2014; 15 Suppl 13:S2.

PMID: 25434643 PMC: 4248650. DOI: 10.1186/1471-2105-15-S13-S2.


Classification and assessment tools for structural motif discovery algorithms.

Badr G, Al-Turaiki I, Mathkour H. BMC Bioinformatics. 2013; 14 Suppl 9:S4.

PMID: 23902564 PMC: 3698030. DOI: 10.1186/1471-2105-14-S9-S4.


CONS-COCOMAPS: a novel tool to measure and visualize the conservation of inter-residue contacts in multiple docking solutions.

Vangone A, Oliva R, Cavallo L. BMC Bioinformatics. 2012; 13 Suppl 4:S19.

PMID: 22536965 PMC: 3434444. DOI: 10.1186/1471-2105-13-S4-S19.


PicXAA-R: efficient structural alignment of multiple RNA sequences using a greedy approach.

Sahraeian S, Yoon B. BMC Bioinformatics. 2011; 12 Suppl 1:S38.

PMID: 21342569 PMC: 3044294. DOI: 10.1186/1471-2105-12-S1-S38.

