
Rore: Robust and Efficient Antioxidant Protein Classification Via a Novel Dimensionality Reduction Strategy Based on Learning of Fewer Features

Overview
Journal Genomics Inform
Publisher Biomed Central
Specialty Biology
Date 2024 Dec 5
PMID 39633440
Abstract

In protein identification, researchers increasingly aim to achieve efficient classification using fewer features. Many feature selection methods effectively reduce the number of model features, but because they merely keep some features and discard the rest, they often lose information, which limits classifier performance. To address this issue, we present Rore, an algorithm based on a feature-dimensionality reduction strategy. By mapping the original features to a latent space, Rore retains all relevant feature information while representing it with a smaller number of latent features. This approach preserves much of the original information and overcomes the information loss associated with previous feature selection methods. In extensive experimental validation and analysis, Rore performed well on an antioxidant protein dataset, achieving an accuracy of 95.88% and a Matthews correlation coefficient (MCC) of 91.78% using feature vectors of only 15 dimensions. The Rore algorithm is available online at http://112.124.26.17:8021/Rore.
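The contrast the abstract draws between selecting features and mapping them to a latent space can be illustrated with a minimal sketch. The abstract does not describe Rore's actual mapping, so a plain linear projection (PCA computed via SVD) stands in for it here. The synthetic data, the helper names `select_top_variance` and `fit_linear_latent_map`, and the choice of 60 input features are all assumptions made for illustration. Only the target size of 15 comes from the abstract.

```python
import numpy as np


def select_top_variance(X, k):
    """Feature selection: return the indices of the k highest-variance columns."""
    idx = np.argsort(X.var(axis=0))[::-1][:k]
    return np.sort(idx)


def fit_linear_latent_map(X, k):
    """Linear latent map (PCA via SVD), used as a stand-in, NOT Rore's method.

    Returns the column means and an (n_features, k) projection matrix.
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T


# Synthetic stand-in for a protein feature matrix (hypothetical data):
# a 10-dimensional signal spread across all 60 features, plus small noise.
rng = np.random.default_rng(0)
n_samples, n_features, k = 200, 60, 15
signal = rng.normal(size=(n_samples, 10))
mixing = rng.normal(size=(10, n_features))
X = signal @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# Latent mapping: every original feature contributes to each of the 15 latent features.
mean, W = fit_linear_latent_map(X, k)
Z = (X - mean) @ W
X_from_latent = Z @ W.T + mean

# Feature selection: 15 columns are kept, the other 45 are discarded outright.
kept = select_top_variance(X, k)
X_selected = X[:, kept]
X_from_selected = np.tile(mean, (n_samples, 1))  # discarded columns fall back to their mean
X_from_selected[:, kept] = X_selected

# Mean squared reconstruction error as a rough proxy for information lost.
err_latent = np.mean((X - X_from_latent) ** 2)
err_selected = np.mean((X - X_from_selected) ** 2)
print("latent map, 15 features:", err_latent)
print("selection, 15 features:", err_selected)
```

Both routes produce a 15-column representation, but the latent features are combinations of all original columns, whereas selection has no access to the columns it dropped. Reconstruction error is only a proxy here. It says nothing about classification accuracy or MCC, which would have to be measured on labelled data.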
