
Explainable Machine Learning Models Using Robust Cancer Biomarkers Identification from Paired Differential Gene Expression

Overview
Journal Int J Mol Sci
Publisher MDPI
Date 2024 Nov 27
PMID 39596491
Abstract

In oncology, there is a critical need for robust biomarkers that can be easily translated into the clinic. We introduce a novel approach that uses paired differential gene expression analysis for biological feature selection in machine learning models, enhancing robustness and interpretability while accounting for patient variability. The method compares each patient's primary tumor tissue with healthy tissue from the same patient, which improves gene selection by eliminating individual-specific artifacts. We focused on carcinoma because of its prevalence and the availability of data, and we aimed to identify biomarkers involved in general carcinoma progression, including in less-researched types. We identified 27 pivotal genes that can distinguish healthy from carcinoma tissue, even in carcinoma types not seen during discovery. The panel could also precisely identify the tissue of origin in the eight carcinoma types used in the discovery phase. Notably, in a proof of concept, the model accurately identified the primary tissue of origin in metastatic samples despite limited sample availability. Functional annotation shows that these genes are involved in cancer hallmarks and capture subtle variations across carcinoma types. We propose paired differential gene expression analysis as a reference method for the discovery of robust biomarkers.
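The paired comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes log2 expression values, uses a plain paired t-statistic as a stand-in for whatever differential expression test the study used, and the gene names (`GENE_A`, `GENE_B`) and values are hypothetical toy data. The point it shows is that subtracting each patient's healthy value from their tumor value removes patient-specific baselines before genes are ranked.

```python
import math
import statistics


def paired_gene_stats(tumor, normal):
    """Return {gene: (mean log2 difference, paired t-statistic)}.

    tumor and normal map gene -> list of log2 expression values,
    with index i referring to the same patient in both lists.
    """
    stats = {}
    for gene, tumor_values in tumor.items():
        # Per-patient difference: tumor minus matched healthy tissue.
        diffs = [t - n for t, n in zip(tumor_values, normal[gene])]
        mean_diff = statistics.mean(diffs)
        sd = statistics.stdev(diffs)  # sample standard deviation
        if sd == 0:
            # Degenerate case: identical differences in every patient.
            t_stat = 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
        else:
            t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
        stats[gene] = (mean_diff, t_stat)
    return stats


# Hypothetical toy data: four patients with very different baselines.
tumor = {
    "GENE_A": [5.0, 7.1, 9.0, 6.2],
    "GENE_B": [4.1, 6.0, 7.9, 5.0],
}
normal = {
    "GENE_A": [3.0, 5.0, 7.0, 4.0],
    "GENE_B": [4.0, 6.1, 8.0, 5.1],
}

results = paired_gene_stats(tumor, normal)

# Rank genes by the magnitude of the paired t-statistic.
ranking = sorted(results, key=lambda g: abs(results[g][1]), reverse=True)

for gene in ranking:
    mean_diff, t_stat = results[gene]
    print(f"{gene}: mean log2 difference = {mean_diff:.3f}, t = {t_stat:.2f}")
```

In this toy data, expression varies by several log2 units between patients, yet `GENE_A` is consistently about two units higher in tumor than in matched healthy tissue, so it ranks first. An unpaired comparison of the same values would be dominated by the between-patient spread. A real analysis would also need multiple-testing correction and a count-aware statistical model, which this sketch omits.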
