
Utilizing Feature Selection Techniques for AI-Driven Tumor Subtype Classification: Enhancing Precision in Cancer Diagnostics

Overview
Journal: Biomolecules
Publisher: MDPI
Date: 2025 Jan 25
PMID: 39858475
Abstract

Cancer's heterogeneity presents significant challenges in accurate diagnosis and effective treatment, including the complexity of identifying tumor subtypes and their diverse biological behaviors. This review examines how feature selection techniques address these challenges by improving the interpretability and performance of machine learning (ML) models in high-dimensional datasets. Feature selection methods, such as filter, wrapper, and embedded techniques, play a critical role in enhancing the precision of cancer diagnostics by identifying relevant biomarkers. The integration of multi-omics data and ML algorithms facilitates a more comprehensive understanding of tumor heterogeneity, advancing both diagnostics and personalized therapies. However, challenges such as ensuring data quality, mitigating overfitting, and addressing scalability remain critical limitations of these methods. Artificial intelligence (AI)-powered feature selection offers promising solutions to these issues by automating and refining the feature extraction process. This review highlights the transformative potential of these approaches while emphasizing future directions, including the incorporation of deep learning (DL) models and integrative multi-omics strategies for more robust and reproducible findings.
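To make the filter category mentioned in the abstract concrete, the following is a minimal sketch, not the method of the review or of any cited study. It assumes a two-class problem and scores each feature independently with a Fisher-style score (squared gap between class means divided by the summed within-class variances), then keeps the top k. The data are synthetic, and every name here (`fisher_score`, `select_top_k`, `make_toy_expression`, the choice of columns 0 and 3 as informative) is hypothetical and chosen only for illustration. Wrapper and embedded methods, which involve training a model, are not shown.

```python
import random
from statistics import mean, variance


def fisher_score(values_a, values_b):
    """Two-class Fisher-style score: squared mean gap over summed within-class variance."""
    gap = mean(values_a) - mean(values_b)
    spread = variance(values_a) + variance(values_b)
    return gap * gap / spread if spread > 0 else 0.0


def select_top_k(samples, labels, k):
    """Filter-style selection: score every feature on its own, keep the k best."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        class0 = [row[j] for row, y in zip(samples, labels) if y == 0]
        class1 = [row[j] for row, y in zip(samples, labels) if y == 1]
        scores.append(fisher_score(class0, class1))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


def make_toy_expression(n_per_class=30, n_features=6, informative=(0, 3),
                        shift=4.0, seed=0):
    """Synthetic stand-in for an expression matrix (assumption, not real data).

    Every column is unit-variance Gaussian noise; only the `informative`
    columns are shifted for class 1.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            samples.append(row)
            labels.append(label)
    return samples, labels


samples, labels = make_toy_expression()
selected, scores = select_top_k(samples, labels, k=2)
print("selected feature indices:", selected)
```

Because each feature is scored in isolation, a filter like this is cheap on high-dimensional data but cannot see interactions between features. In practice the scoring should also be done inside each cross-validation fold, since selecting features on the full dataset before evaluation is one route to the overfitting the abstract warns about.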
