PseUI: Pseudouridine Sites Identification Based on RNA Sequence Information

Overview
Publisher BioMed Central
Specialty Biology
Date 2018 Aug 31
PMID 30157750
Citations 50
Abstract

Background: Pseudouridylation is the most prevalent posttranscriptional modification in various stable RNAs of all organisms, and it significantly affects many RNA-regulated cellular processes. Accurate identification of pseudouridine (Ψ) sites in RNA would therefore greatly benefit the understanding of these processes. Because currently available experimental methods are inefficient and costly, computational methods that detect Ψ sites in RNA sequences accurately and efficiently are highly desirable. However, the predictive accuracy of existing computational methods is unsatisfactory and needs improvement.

Results: In this study, we developed a new model, PseUI, for Ψ site identification in three species: H. sapiens, S. cerevisiae, and M. musculus. First, five kinds of features were generated from RNA segments: nucleotide composition (NC), dinucleotide composition (DC), pseudo dinucleotide composition (pseDNC), position-specific nucleotide propensity (PSNP), and position-specific dinucleotide propensity (PSDP). Then, a sequential forward feature selection strategy was used to obtain a compact feature subset that retains discriminative power. Based on the selected feature subsets, we built our model with a support vector machine (SVM). Finally, the generalization of our model was validated with both the jackknife test and independent validation tests on the benchmark datasets. The experimental results showed that our model is more accurate and stable than previously published models. We also provide a user-friendly web server for our model at http://zhulab.ahu.edu.cn/PseUI and brief instructions for the web server in this paper. With these instructions, academic users can conveniently obtain their desired results without complicated calculations.

Conclusion: In this study, we proposed a new predictor, PseUI, to detect Ψ sites in RNA sequences. Our model outperformed the existing state-of-the-art models. We expect that PseUI will become a useful tool for accurate identification of RNA Ψ sites.
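
The feature encoding and selection steps named in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no code, so every function name below is hypothetical, only the two simplest feature types (NC and DC) are shown, and the `score` callable is a stand-in for whatever subset-quality measure is used (for example, a cross-validated SVM accuracy, which is an assumption here).

```python
from itertools import product
from typing import Callable, List, Optional

NUCLEOTIDES = "ACGU"
# All 16 ordered pairs: AA, AC, AG, AU, CA, ..., UU
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def nucleotide_composition(seq: str) -> List[float]:
    """NC: fraction of A, C, G, and U in the RNA segment (4 values)."""
    seq = seq.upper()
    return [seq.count(n) / len(seq) for n in NUCLEOTIDES]


def dinucleotide_composition(seq: str) -> List[float]:
    """DC: fraction of each of the 16 overlapping dinucleotides."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(d) / len(pairs) for d in DINUCLEOTIDES]


def sequential_forward_selection(
    n_features: int,
    score: Callable[[List[int]], float],
    max_features: Optional[int] = None,
) -> List[int]:
    """Greedy forward selection over feature indices.

    Starting from an empty set, repeatedly add the single feature that
    gives the highest score, and stop when no addition improves it.
    `score` is a hypothetical callable that rates a candidate subset.
    """
    selected: List[int] = []
    remaining = list(range(n_features))
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        # max() keeps the first maximum, so ties go to the lowest index.
        top_score, top_feature = max(candidates, key=lambda c: c[0])
        if top_score <= best_score:
            break  # no candidate improves the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


if __name__ == "__main__":
    segment = "GAUUACA"  # toy segment, not taken from the benchmark datasets
    features = nucleotide_composition(segment) + dinucleotide_composition(segment)
    print(len(features))  # 4 NC values + 16 DC values

    # Toy score: per-feature gain minus a size penalty (illustration only).
    gains = [0.5, 0.05, 0.3]
    toy_score = lambda subset: sum(gains[i] for i in subset) - 0.1 * len(subset)
    print(sequential_forward_selection(3, toy_score))
```

In a real pipeline, the toy score would be replaced by a function that trains and evaluates a classifier on the candidate columns, such as scikit-learn's `SVC` scored with `cross_val_score`. That pairing is a suggestion for experimentation rather than something the abstract specifies.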

Citing Articles

Bioinformatics for Inosine: Tools and Approaches to Trace This Elusive RNA Modification.

Bortoletto E, Rosani U. Genes (Basel). 2024; 15(8).

PMID: 39202357 PMC: 11353476. DOI: 10.3390/genes15080996.


PseUpred-ELPSO Is an Ensemble Learning Predictor with Particle Swarm Optimizer for Improving the Prediction of RNA Pseudouridine Sites.

Wang X, Li P, Wang R, Gao X. Biology (Basel). 2024; 13(4).

PMID: 38666860 PMC: 11048358. DOI: 10.3390/biology13040248.


Fuzzy kernel evidence Random Forest for identifying pseudouridine sites.

Chen M, Sun M, Su X, Tiwari P, Ding Y. Brief Bioinform. 2024; 25(3).

PMID: 38622357 PMC: 11018548. DOI: 10.1093/bib/bbae169.


Interpretable Multi-Scale Deep Learning for RNA Methylation Analysis across Multiple Species.

Wang R, Chung C, Lee T. Int J Mol Sci. 2024; 25(5).

PMID: 38474116 PMC: 10932270. DOI: 10.3390/ijms25052869.


PseU-ST: A new stacked ensemble-learning method for identifying RNA pseudouridine sites.

Zhang X, Wang S, Xie L, Zhu Y. Front Genet. 2023; 14:1121694.

PMID: 36741328 PMC: 9892456. DOI: 10.3389/fgene.2023.1121694.

