The Weight-Based Feature Selection (WBFS) Algorithm Classifies Lung Cancer Subtypes Using Proteomic Data

Overview
Journal: Entropy (Basel)
Publisher: MDPI
Date: 2023 Jul 29
PMID: 37509950
Abstract

Feature selection plays an important role in improving classification performance and in reducing the dimensionality of high-dimensional datasets, such as high-throughput genomics and proteomics data in bioinformatics. Information theory is computationally efficient and scalable, and it has been widely incorporated into feature selection. In this study, we propose a weight-based feature selection (WBFS) algorithm that assesses both the already selected features and the remaining candidate features to identify key protein biomarkers for classifying lung cancer subtypes, using data from The Cancer Proteome Atlas (TCPA) database. We further perform a survival analysis relating the selected biomarkers to the lung cancer subtypes. The results show that combining the WBFS method with a Bayesian network performs well for mining potential biomarkers. These candidate signatures have valuable biological significance for tumor classification and patient survival analysis. Taken together, this study proposes the WBFS method, which helps to explore candidate biomarkers in biomedical datasets and provides useful information for tumor diagnosis and therapy strategies.
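To make concrete the general idea of scoring candidate features against features already selected, the following is a minimal sketch of a generic information-theoretic greedy selection (relevance to the class label minus mean redundancy with the selected set, in the style of mRMR). It is not the WBFS algorithm: the abstract does not state the WBFS weighting scheme, so this criterion is a stand-in. The function names, the feature names `prot_a`, `prot_b`, `prot_c`, and the toy values are all hypothetical, and the features are assumed to be already discretized.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def greedy_select(features, labels, k):
    """Pick up to k feature names by relevance minus mean redundancy.

    `features` maps a (hypothetical) feature name to its discretized values.
    This is a generic mRMR-style criterion, not the WBFS weighting.
    """
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        best_name, best_score = None, float("-inf")
        for name, col in features.items():
            if name in selected:
                continue
            redundancy = 0.0
            if selected:
                # Average information shared with the features chosen so far.
                redundancy = sum(
                    mutual_information(col, features[s]) for s in selected
                ) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data (made up for illustration): two subtypes, three discretized proteins.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "prot_a": [0, 0, 0, 1, 1, 1, 1, 1],  # informative about the label
    "prot_b": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy of prot_a, fully redundant
    "prot_c": [0, 1, 0, 0, 1, 1, 0, 1],  # weakly informative, little overlap
}
print(greedy_select(features, labels, 2))
```

On this toy input the redundant copy `prot_b` is passed over in favor of `prot_c`, which shows why a criterion that looks at both the selected set and the candidates can prefer a weaker but complementary feature.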

Citing Articles

Utilizing Feature Selection Techniques for AI-Driven Tumor Subtype Classification: Enhancing Precision in Cancer Diagnostics.

Wang J, Zhang Z, Wang Y. Biomolecules. 2025; 15(1).

PMID: 39858475 PMC: 11763904. DOI: 10.3390/biom15010081.
