
StackedEnC-AOP: Prediction of Antioxidant Proteins Using Transform Evolutionary and Sequential Features Based Multi-scale Vector with Stacked Ensemble Learning

Overview
Publisher BioMed Central
Specialty Biology
Date 2024 Aug 4
PMID 39098908
Abstract

Background: Antioxidant proteins are involved in several biological processes and can protect DNA and cells from free-radical damage. These proteins regulate the body's oxidative stress and play a significant role in many antioxidant-based drugs. Current in vitro-based medications are costly and time-consuming, and they cannot efficiently screen for and identify the targeted motifs of antioxidant proteins.

Methods: We propose StackedEnC-AOP, an accurate prediction method for discriminating antioxidant proteins. The training sequences are encoded by incorporating a discrete wavelet transform (DWT) into the evolutionary matrix: the PSSM-based images are decomposed through two levels of DWT to form a pseudo position-specific scoring matrix (PsePSSM-DWT) based embedded vector. In addition, the evolutionary difference formula and composite physicochemical properties methods are employed to collect structural and sequential descriptors. The sequential features, evolutionary descriptors, and physicochemical properties are then combined into a single vector to compensate for the weaknesses of the individual encoding schemes. To reduce the computational cost of the combined feature vector, the optimal features are selected using minimum redundancy and maximum relevance (mRMR). The optimal feature vector is then used to train a stacking-based ensemble meta-model.
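As a rough illustration of the two-level DWT step named in the Methods, the sketch below decomposes each column of a PSSM (one row per residue, 20 columns) and summarises every sub-band with its mean and standard deviation. The abstract does not state the wavelet family, the padding rule, or which statistics are kept, so the Haar wavelet, the last-value padding, and the names `haar_step` and `dwt_features` are all assumptions made for this example and not the authors' implementation.

```python
import math
from statistics import mean, pstdev


def haar_step(signal):
    """One Haar DWT level: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        # Assumption: pad odd-length signals by repeating the last value.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_features(pssm, levels=2):
    """Build a fixed-length vector from a variable-length PSSM.

    For every column, run `levels` Haar levels and keep the mean and
    standard deviation of each detail band plus the final approximation.
    """
    features = []
    for c in range(len(pssm[0])):
        approx = [row[c] for row in pssm]
        bands = []
        for _ in range(levels):
            approx, detail = haar_step(approx)
            bands.append(detail)
        bands.append(approx)
        for band in bands:
            features.extend([mean(band), pstdev(band)])
    return features


# Toy PSSM: 8 residues x 20 amino-acid columns (hypothetical values).
toy_pssm = [[float((r + c) % 5) for c in range(20)] for r in range(8)]
vector = dwt_features(toy_pssm)
print(len(vector))  # 20 columns * 3 bands * 2 statistics
```

With two levels there are three sub-bands per column (two detail bands and one approximation band), so the vector length does not depend on the sequence length, which is what allows proteins of different sizes to share one feature space.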

Results: StackedEnC-AOP achieved a prediction accuracy of 98.40% and an AUC of 0.99 on the training sequences. For validation, the trained StackedEnC-AOP model achieved an accuracy of 96.92% and an AUC of 0.98 on an independent set.
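For readers less familiar with the two metrics reported in the Results, the following minimal sketch shows how accuracy and AUC are commonly computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the function names `accuracy` and `roc_auc` are chosen for this example only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = antioxidant protein, 0 = non-antioxidant (hypothetical).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, predicted))
print(roc_auc(labels, scores))
```

Accuracy depends on the decision threshold (0.5 here, an arbitrary choice), whereas AUC is threshold-free, which is one reason the two are usually reported together.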

Conclusion: StackedEnC-AOP performed significantly better than current computational models, improving accuracy by ~5% and ~3% on the training and independent sets, respectively. Its efficacy and consistency make it a valuable tool for data scientists, and it can play a key role in academic research and drug design.

Citing Articles

Early warning strategies for corporate operational risk: A study by an improved random forest algorithm using FCM clustering.

Fang X PLoS One. 2025; 20(3):e0318491.

PMID: 40067839 PMC: 11896059. DOI: 10.1371/journal.pone.0318491.


Smart waste classification in IoT-enabled smart cities using VGG16 and Cat Swarm Optimized random forest.

Gaurav A, Gupta B, Arya V, Attar R, Bansal S, Alhomoud A PLoS One. 2025; 20(2):e0316930.

PMID: 40019915 PMC: 11870384. DOI: 10.1371/journal.pone.0316930.


Explainable AI-driven prediction of APE1 inhibitors: enhancing cancer therapy with machine learning models and feature importance analysis.

Iqbal A, Masoodi T, Bhat A, Macha M, Assad A, Shah S Mol Divers. 2025; .

PMID: 39982681 DOI: 10.1007/s11030-025-11133-6.


Addressing imbalanced data classification with Cluster-Based Reduced Noise SMOTE.

Hemmatian J, Hajizadeh R, Nazari F PLoS One. 2025; 20(2):e0317396.

PMID: 39928607 PMC: 11809912. DOI: 10.1371/journal.pone.0317396.


XGBoost-enhanced ensemble model using discriminative hybrid features for the prediction of sumoylation sites.

Khan S, Noor S, Javed T, Naseem A, Aslam F, AlQahtani S BioData Min. 2025; 18(1):12.

PMID: 39901279 PMC: 11792219. DOI: 10.1186/s13040-024-00415-8.

