
Metabolomics-Based Machine Learning Models Accurately Predict Breast Cancer Estrogen Receptor Status

Overview
Journal Int J Mol Sci
Publisher MDPI
Date 2024 Dec 17
PMID 39684741
Abstract

Breast cancer is a leading cause of death among women worldwide. Early and precise diagnosis is vital for managing the disease effectively. Breast cancer subtyping based on estrogen receptor (ER) status is crucial for determining prognosis and treatment. This study uses metabolomics data from plasma samples to detect metabolite biomarkers that could distinguish ER-positive from ER-negative breast cancers in a non-invasive manner. The dataset includes demographic information, ER status, and metabolite levels from 188 breast cancer patients and 73 healthy controls. Recursive Feature Elimination (RFE) with a Random Forest (RF) classifier identified an optimal subset of 30 features (29 biomarkers and age) that achieved the highest area under the curve (AUC). To address class imbalance, Gaussian noise-based augmentation and Adaptive Synthetic Oversampling (ADASYN) were applied, ensuring balanced representation during training. Four machine learning (ML) algorithms were evaluated using grid search: Random Forest, Support Vector Classifier (SVC), XGBoost, and Logistic Regression (LR). The Random Forest classifier emerged as the top performer, achieving an AUC of 0.95 and an accuracy of 93%. These results suggest that ML has great promise for identifying specific metabolites linked to ER expression, paving the way for the development of a novel analytical tool that could reduce current challenges in identifying ER status and improve the precision of breast cancer subtyping.
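
As an illustration of two ideas named in the abstract, the following is a minimal Python sketch of Gaussian noise-based augmentation of a minority class and of the AUC metric. It is not the authors' implementation: the abstract does not give the noise level, the augmentation procedure, or which class was the minority, so the function names (`augment_minority`, `roc_auc`), the `noise_scale` parameter, and the integer class labels are all assumptions made for this example. ADASYN, which interpolates between neighboring minority samples, is a different technique and is not sketched here.

```python
import random
import statistics


def augment_minority(rows, labels, minority_label, noise_scale=0.05, seed=0):
    """Balance two classes by appending noisy copies of minority-class rows.

    Each synthetic row is a randomly chosen minority row plus zero-mean
    Gaussian noise. The noise standard deviation for a feature is
    noise_scale times that feature's standard deviation within the
    minority class (noise_scale is an assumed, illustrative parameter).
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    if not minority:
        raise ValueError("no rows carry the minority label")

    n_majority = len(rows) - len(minority)
    n_needed = n_majority - len(minority)
    if n_needed <= 0:
        # Already balanced (or the named class is not the smaller one).
        return list(rows), list(labels)

    # Per-feature spread of the minority class sets the noise magnitude.
    feature_sd = [statistics.pstdev(col) for col in zip(*minority)]

    new_rows, new_labels = list(rows), list(labels)
    for _ in range(n_needed):
        base = rng.choice(minority)
        noisy = [x + rng.gauss(0.0, noise_scale * sd)
                 for x, sd in zip(base, feature_sd)]
        new_rows.append(noisy)
        new_labels.append(minority_label)
    return new_rows, new_labels


def roc_auc(scores, labels, positive_label):
    """AUC as the probability that a random positive sample receives a
    higher score than a random negative sample (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == positive_label]
    neg = [s for s, y in zip(scores, labels) if y != positive_label]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only; these are not data from the study.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5], [3.0, 20.0], [3.2, 21.0]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = augment_minority(X, y, minority_label=1)
print(y_bal.count(0), y_bal.count(1))
```

In this toy run, two synthetic rows are appended so that both classes end up with four rows each. Consistent with the abstract's note that balancing was applied during training, a sketch like this would be run on the training split only, so that held-out samples used to compute the AUC remain unaugmented.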
