
PseUpred-ELPSO Is an Ensemble Learning Predictor with Particle Swarm Optimizer for Improving the Prediction of RNA Pseudouridine Sites

Overview
Journal: Biology (Basel)
Publisher: MDPI
Specialty: Biology
Date: 2024 Apr 26
PMID: 38666860
Abstract

Pseudouridine modification occurs in many RNA types across many species and plays a significant role in regulating biological processes. Understanding the functional mechanisms of RNA pseudouridine sites requires accurate identification of those sites in RNA sequences. Although several fast and inexpensive computational methods have been proposed, improving recognition accuracy and generalization remains a challenge. This study proposes a novel ensemble predictor, PseUpred-ELPSO, for improved RNA pseudouridine site prediction. After the nucleotide composition preferences of RNA pseudouridine site sequences were analyzed, two feature representations were selected and fed into a stacking ensemble framework. Five tree-based machine learning classifiers served as base classifiers, and their outputs were used to construct 30-dimensional RNA profiles that represent the RNA sequences. The particle swarm optimization (PSO) algorithm was then used to search for weights on the RNA profiles to further enhance the representation. A logistic regression classifier served as the meta-classifier that produced the final predictions. Compared with the most advanced existing predictors, PseUpred-ELPSO performed better in both cross-validation and the independent test. Based on the PseUpred-ELPSO predictor, a free and easy-to-operate web server has been established, which is expected to be a powerful tool for pseudouridine site identification.
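
The pipeline described in the abstract (base classifiers producing a profile, PSO-searched profile weights, logistic regression as meta-classifier) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' implementation: the five scikit-learn tree-based classifiers, the synthetic feature matrices `X_a` and `X_b` standing in for the two feature representations, the helper names `build_profile`, `fitness`, and `pso_weights`, the PSO hyperparameters, and the use of cross-validated accuracy as the fitness. The sketch yields a 10-dimensional profile (2 feature sets × 5 classifiers); how the paper arrives at 30 dimensions is not stated in the abstract and is not reproduced.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def build_profile(feature_sets, y, cv=5, seed=0):
    """Stack out-of-fold positive-class probabilities,
    one column per (feature set, base classifier) pair."""
    # Assumed choice of five tree-based learners; the paper's exact set may differ.
    bases = [
        RandomForestClassifier(n_estimators=30, random_state=seed),
        ExtraTreesClassifier(n_estimators=30, random_state=seed),
        GradientBoostingClassifier(n_estimators=30, random_state=seed),
        AdaBoostClassifier(n_estimators=30, random_state=seed),
        DecisionTreeClassifier(max_depth=5, random_state=seed),
    ]
    cols = []
    for X in feature_sets:
        for clf in bases:
            proba = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")
            cols.append(proba[:, 1])
    return np.column_stack(cols)


def fitness(weights, profile, y, cv=3):
    """Cross-validated accuracy of the meta-classifier on the weighted profile."""
    meta = LogisticRegression(max_iter=1000)
    return cross_val_score(meta, profile * weights, y, cv=cv,
                           scoring="accuracy").mean()


def pso_weights(profile, y, n_particles=8, n_iter=5,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over weight vectors constrained to [0, 1]."""
    rng = np.random.default_rng(seed)
    dim = profile.shape[1]
    pos = rng.uniform(0.0, 1.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, profile, y) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        fit = np.array([fitness(p, profile, y) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit


# Synthetic stand-ins for two sequence encodings (not real pseudouridine data).
rng = np.random.default_rng(42)
n = 200
y = rng.integers(0, 2, size=n)
X_a = rng.normal(size=(n, 12)) + 0.8 * y[:, None]
X_b = rng.normal(size=(n, 8)) + 0.5 * y[:, None]

profile = build_profile([X_a, X_b], y)        # shape (n, 10) in this sketch
weights, cv_acc = pso_weights(profile, y)     # PSO-searched profile weights
meta = LogisticRegression(max_iter=1000).fit(profile * weights, y)
```

Because the weights are tuned against the same cross-validation folds used to report `cv_acc`, that score is optimistic; an independent test set, as the abstract describes, is the fairer check.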
