
De Novo Prediction of RNA-protein Interactions from Sequence Information

Overview
Journal: Mol Biosyst
Date: 2012 Nov 10
PMID: 23138266
Citations: 44
Abstract

Protein-RNA interactions are fundamental to understanding cellular processes. In particular, interactions between non-coding RNAs and proteins play an important role in signalling, transcriptional regulation, and even the progression of complex diseases. However, experimental determination of protein-RNA interactions remains time-consuming and labour-intensive. Here, we develop a novel extended naïve Bayes classifier for de novo prediction of protein-RNA interactions using only protein and RNA sequence information. Specifically, we first collect a set of known protein-RNA interactions as gold-standard positives and extract sequence-based features to represent each protein-RNA pair. To bridge the gap between the high dimensionality of the features and the scarcity of gold-standard positives, we select effective features by applying a cutoff to a likelihood ratio score, which not only reduces computational complexity but also allows transparent feature integration during prediction. An extended naïve Bayes classifier is then constructed from these effective features to train a protein-RNA interaction prediction model. Numerical experiments show that our method achieves a prediction accuracy of 0.77 even though only a small number of known protein-RNA interactions are available. In particular, we demonstrate that the extended naïve Bayes classifier outperforms the standard naïve Bayes classifier because it takes the dependences among features into account. Importantly, we conduct ncRNA pull-down experiments to validate the predicted novel protein-RNA interactions and identify the interacting proteins of sbRNA CeN72 in C. elegans, which further demonstrates the effectiveness of our method.
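
The feature-selection and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sequence features, the cutoff value, the negative set, or how the extended classifier models feature dependences. The sketch assumes binary features, a hypothetical set of negative pairs, add-one smoothing, and a plain (not extended) naïve Bayes score that sums log likelihood ratios over selected features present in a pair. All function names and the toy data are invented for illustration.

```python
import math
from typing import List, Sequence

Pair = Sequence[int]  # one protein-RNA pair encoded as binary features (assumption)


def likelihood_ratios(pos: List[Pair], neg: List[Pair], pseudo: float = 1.0) -> List[float]:
    """LR_j = P(f_j = 1 | positive) / P(f_j = 1 | negative), with smoothing."""
    ratios = []
    for j in range(len(pos[0])):
        p_pos = (sum(row[j] for row in pos) + pseudo) / (len(pos) + 2 * pseudo)
        p_neg = (sum(row[j] for row in neg) + pseudo) / (len(neg) + 2 * pseudo)
        ratios.append(p_pos / p_neg)
    return ratios


def select_features(ratios: List[float], cutoff: float) -> List[int]:
    """Keep the indices of features whose likelihood ratio reaches the cutoff."""
    return [j for j, lr in enumerate(ratios) if lr >= cutoff]


def naive_bayes_log_score(pair: Pair, ratios: List[float], selected: List[int]) -> float:
    """Plain naive Bayes: sum log LR over selected features present in the pair.

    Features are treated as independent here; the extended classifier in the
    abstract additionally accounts for dependences, which this sketch omits.
    """
    return sum(math.log(ratios[j]) for j in selected if pair[j] == 1)


# Toy data (hypothetical): three features, three positive and three negative pairs.
positives = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]
negatives = [[0, 1, 1], [0, 0, 1], [0, 1, 0]]

ratios = likelihood_ratios(positives, negatives)
selected = select_features(ratios, cutoff=2.0)
score_a = naive_bayes_log_score([1, 0, 1], ratios, selected)
score_b = naive_bayes_log_score([0, 1, 1], ratios, selected)
```

On the toy data only the first feature passes the cutoff, so a pair carrying that feature receives a positive log score and a pair lacking it scores zero. Because each selected feature contributes its own additive term, the contribution of every feature to a prediction stays visible, which is one way to read the "transparent feature integration" mentioned above.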
