
Using Machine Learning Methods to Study Colorectal Cancer Tumor Micro-Environment and Its Biomarkers

Overview
Journal Int J Mol Sci
Publisher MDPI
Date 2023 Jul 14
PMID 37446311
Abstract

Colorectal cancer (CRC) is a leading cause of cancer deaths worldwide, and the identification of biomarkers can improve early detection and personalized treatment. In this study, RNA-seq data and gene chip data from TCGA and GEO were used to explore potential biomarkers for CRC. The SMOTE method was used to address class imbalance, and four feature selection algorithms (MCFS, Boruta, mRMR, and LightGBM) were used to select genes from the gene expression matrix. Four machine learning algorithms (SVM, XGBoost, RF, and kNN) were then employed to obtain the optimal number of genes for model construction. Through interpretable machine learning (IML), co-predictive networks were generated to identify rules and uncover underlying relationships among the selected genes. Survival analysis revealed that five of the selected genes were significantly correlated with prognosis in CRC patients. In addition, the CIBERSORT algorithm was used to investigate the proportion of immune cells in CRC tissues, and gene mutation rates for the five selected biomarkers were explored. The biomarkers identified in this study have significant implications for the development of personalized therapies and could ultimately lead to improved clinical outcomes for CRC patients.
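The class-imbalance step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation, and it is not the API of any particular library. The function name `smote`, the toy two-gene expression values, and the choice of normal tissue as the minority class are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration only; a simplified variant of SMOTE
    that picks the base sample at random for every synthetic point.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy expression matrix: rows are samples, columns are two made-up genes.
# Assumption: normal tissue is the minority class, tumor the majority class.
normal = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
tumor = [[5.0, 6.0], [5.5, 6.2], [4.8, 5.9], [5.1, 6.4],
         [5.3, 5.7], [4.9, 6.1], [5.6, 6.0], [5.2, 6.3]]

# Oversample the minority class until both classes have the same size.
new_normal = smote(normal, n_new=len(tumor) - len(normal), k=2)
balanced_normal = normal + new_normal
print(len(balanced_normal), len(tumor))
```

Because every synthetic point is a convex combination of two real minority samples, it always lies inside the range spanned by the observed minority values. In a real analysis, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into the evaluation data.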

Citing Articles

Machine Learning-Enabled Non-Invasive Screening of Tumor-Associated Circulating Transcripts for Early Detection of Colorectal Cancer.

Han J, Park S, Kim L, Chung S, Kim T, Lee J. Int J Mol Sci. 2025; 26(4).

PMID: 40003943 PMC: 11855660. DOI: 10.3390/ijms26041477.


Clinical Validation of a Machine Learning-Based Biomarker Signature to Predict Response to Cytotoxic Chemotherapy Alone or Combined with Targeted Therapy in Metastatic Colorectal Cancer Patients: A Study Protocol and Review.

Pagano D, Barresi V, Tropea A, Galvano A, Bazan V, Caldarella A. Life (Basel). 2025; 15(2).

PMID: 40003728 PMC: 11857289. DOI: 10.3390/life15020320.


Integration of transcriptomics and machine learning for insights into breast cancer: exploring lipid metabolism and immune interactions.

Chen X, Yi J, Xie L, Liu T, Liu B, Yan M. Front Immunol. 2024; 15:1470167.

PMID: 39524444 PMC: 11543460. DOI: 10.3389/fimmu.2024.1470167.


Identification of the m6A/m5C/m1A methylation modification genes in Alzheimer's disease based on bioinformatic analysis.

Tan Q, Zhou D, Guo Y, Chen H, Xie P. Aging (Albany NY). 2024; 16(21):13340-13355.

PMID: 39485681 PMC: 11719101. DOI: 10.18632/aging.206146.


Identifying Explainable Machine Learning Models and a Novel SFRP2 Fibroblast Signature as Predictors for Precision Medicine in Ovarian Cancer.

Yang Z, Zhou D, Huang J. Int J Mol Sci. 2023; 24(23).

PMID: 38069266 PMC: 10706905. DOI: 10.3390/ijms242316942.
