
Exploiting Structural and Topological Information to Improve Prediction of RNA-protein Binding Sites

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2009 Oct 20
PMID: 19835626
Citations: 31
Abstract

Background: RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominantly rely on sequence data. It is known, however, that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions to this prediction problem. In particular, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy.

Results: We quantified the impact of structural information on prediction accuracy, relative to a purely sequence-based approach, using two machine learning techniques: Naïve Bayes classifiers and Support Vector Machines. The highest AUC, 0.83, was achieved by a Support Vector Machine using the PSI-BLAST profile, accessible surface area, betweenness-centrality and retention coefficient as input features. Given that our results are based on a larger non-redundant data set, the prediction accuracy is considerably higher than that reported in previous, comparable studies. A protein-RNA interface predictor (PRIP) and the data set have been made available at http://www.qfab.org/PRIP.
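For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from per-residue classifier scores, using the rank-sum (Mann-Whitney U) identity. It is not taken from the paper: the function name `roc_auc`, the 0/1 label encoding (1 = interface residue) and the example scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (1 = interface residue; assumed encoding)
    scores: classifier scores, where higher means "more likely to bind RNA"
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, giving tied scores their average rank.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    # U statistic of the positive class, normalised by the number of
    # positive/negative pairs, equals the AUC.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical scores for four residues (illustrative values only).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 1.0 means every interface residue is scored above every non-interface residue, while 0.5 corresponds to random ranking, which is the scale on which the reported 0.83 should be read.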

Conclusion: Graph-theoretic properties of residue contact maps derived from protein structures, such as betweenness-centrality, can supplement sequence or structure features to improve the prediction accuracy for binding residues in RNA-protein interactions. While Support Vector Machines perform better on this task, Naïve Bayes classifiers have also been found to achieve good prediction accuracies while requiring much less training time, making them an attractive choice for large-scale predictions.
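As an illustration of the graph-theoretic feature named in the conclusion, the sketch below builds a residue contact graph from 3D coordinates and computes betweenness-centrality with Brandes' algorithm. It is a sketch under stated assumptions and not the authors' implementation: the 8 Å distance cutoff, the use of one representative point per residue, and the function names `contact_graph` and `betweenness_centrality` are all hypothetical choices, since the abstract does not specify how the contact maps were constructed.

```python
import math
from collections import deque


def contact_graph(coords, cutoff=8.0):
    """Link residues whose representative points lie within `cutoff`.

    coords: list of (x, y, z) tuples, one per residue (assumed input).
    cutoff: distance threshold in angstroms (assumed value, not from the paper).
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def betweenness_centrality(adj):
    """Unnormalised betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {v: b / 2.0 for v, b in bc.items()}


# Toy example: five residues on a line, 5 angstroms apart, form a chain.
toy_coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
toy_scores = betweenness_centrality(contact_graph(toy_coords))
```

In the toy chain the central residue lies on the most shortest paths and so receives the highest score; in a real structure, the resulting per-residue values could be appended to sequence-derived features before training a classifier.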

Citing Articles

Decoding the underlying mechanisms of Di-Tan-Decoction in treating intracerebral hemorrhage based on network pharmacology.

Zhen Z, Xue D, Chen Y, Li J, Gao Y, Shen Y. BMC Complement Med Ther. 2023; 23(1):44.

PMID: 36765346 PMC: 9912606. DOI: 10.1186/s12906-022-03831-7.


PRIP: A Protein-RNA Interface Predictor Based on Semantics of Sequences.

Li Y, Lyu J, Wu Y, Liu Y, Huang G. Life (Basel). 2022; 12(2).

PMID: 35207594 PMC: 8879494. DOI: 10.3390/life12020307.


PRIME-3D2D is a 3D2D model to predict binding sites of protein-RNA interaction.

Xie J, Zheng J, Hong X, Tong X, Liu S. Commun Biol. 2020; 3(1):384.

PMID: 32678300 PMC: 7366699. DOI: 10.1038/s42003-020-1114-y.


Protein-RNA interactions: structural biology and computational modeling techniques.

Jones S. Biophys Rev. 2017; 8(4):359-367.

PMID: 28510023 PMC: 5430296. DOI: 10.1007/s12551-016-0223-9.


RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites.

Luo J, Liu L, Venkateswaran S, Song Q, Zhou X. Sci Rep. 2017; 7(1):614.

PMID: 28377624 PMC: 5429624. DOI: 10.1038/s41598-017-00795-4.

