Machine Learning-Based Interpretation and Visualization of Nonlinear Interactions in Prostate Cancer Survival
Purpose: Shapley additive explanation (SHAP) values represent a unified approach to interpreting predictions made by complex machine learning (ML) models, with superior consistency and accuracy compared with prior methods. We describe a novel application of SHAP values to the prediction of mortality risk in prostate cancer.
Methods: Patients with nonmetastatic, node-negative prostate cancer, diagnosed between 2004 and 2015, were identified using the National Cancer Database. Model features were specified a priori: age, prostate-specific antigen (PSA), Gleason score, percent positive cores (PPC), comorbidity score, and clinical T stage. We trained a gradient-boosted tree model and applied SHAP values to model predictions. Open-source libraries in Python 3.7 were used for all analyses.
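To make the attribution step concrete, the following is a minimal sketch of how Shapley values divide one prediction among its features. It is not the authors' pipeline: instead of a gradient-boosted tree model and a SHAP library, it enumerates feature subsets exactly for a hypothetical `toy_risk` function. The coefficients, the `patient` and `baseline` values, and the choice to replace "absent" features with baseline values are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values for one instance by enumerating feature subsets.

    Features outside a coalition are set to their baseline value, which is
    a simplifying assumption (tree-based SHAP handles this differently).
    """
    names = list(x)
    n = len(names)

    def v(coalition):
        # Prediction when only the features in `coalition` take the patient's values.
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return value_fn(point)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Classic Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi


def toy_risk(p):
    """Hypothetical risk score with invented coefficients (not from the study)."""
    high_gleason = 1.0 if p["gleason"] >= 8 else 0.0
    high_ppc = 1.0 if p["ppc"] >= 50 else 0.0
    return (0.02 * (p["age"] - 60)
            + 0.5 * high_gleason
            + 0.2 * high_ppc
            + 0.6 * high_gleason * high_ppc)  # nonlinear interaction term


patient = {"age": 70, "gleason": 9, "ppc": 60}   # hypothetical patient
baseline = {"age": 60, "gleason": 6, "ppc": 20}  # hypothetical reference values

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's prediction and the baseline prediction, and the interaction term's contribution is split evenly between Gleason score and PPC. Exact enumeration scales exponentially with the number of features, which is why tree-specific algorithms are used in practice.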
Results: We identified 372,808 patients meeting the inclusion criteria. When analyzing the interaction between PSA and Gleason score, we demonstrated consistency with the literature using the example of low-PSA, high-Gleason prostate cancer, recently identified as a unique entity with a poor prognosis. When analyzing the PPC-Gleason score interaction, we identified a novel finding of stronger interaction effects in patients with Gleason ≥ 8 disease compared with Gleason 6-7 disease, particularly with PPC ≥ 50%. Subsequent confirmatory linear analyses supported this finding: 5-year overall survival in Gleason ≥ 8 patients was 87.7% with PPC < 50% versus 77.2% with PPC ≥ 50% (P < .001), compared with 89.1% versus 86.0% in Gleason 7 patients (P < .001), with a significant interaction term between PPC ≥ 50% and Gleason ≥ 8 (P < .001).
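The idea of a pairwise interaction effect can be sketched with the Shapley interaction index, the quantity that SHAP interaction values are built on. The example below is a toy illustration under stated assumptions, not a reproduction of the reported analysis: `toy_risk`, its coefficients, the `baseline` values, and both example patients are hypothetical, and the index is computed by brute-force enumeration rather than with a tree-specific algorithm.

```python
from itertools import combinations
from math import factorial


def shap_interaction(value_fn, x, baseline, i, j):
    """Off-diagonal Shapley interaction value between features i and j.

    The full pairwise interaction is split equally between (i, j) and (j, i),
    so this returns half of the total interaction effect.
    """
    names = list(x)
    m = len(names)
    rest = [k for k in names if k not in (i, j)]

    def v(coalition):
        # Features outside the coalition fall back to baseline (an assumption).
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return value_fn(point)

    total = 0.0
    for size in range(len(rest) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(rest, size):
            s = set(subset)
            # Joint effect of i and j beyond the sum of their separate effects.
            total += weight * (v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s))
    return total


def toy_risk(p):
    """Hypothetical risk score; coefficients are invented for illustration."""
    high_gleason = 1.0 if p["gleason"] >= 8 else 0.0
    high_ppc = 1.0 if p["ppc"] >= 50 else 0.0
    return (0.02 * (p["age"] - 60) + 0.5 * high_gleason
            + 0.2 * high_ppc + 0.6 * high_gleason * high_ppc)


baseline = {"age": 60, "gleason": 6, "ppc": 20}      # hypothetical reference
gleason_9 = {"age": 70, "gleason": 9, "ppc": 60}     # hypothetical patients
gleason_7 = {"age": 70, "gleason": 7, "ppc": 60}

high = shap_interaction(toy_risk, gleason_9, baseline, "gleason", "ppc")
low = shap_interaction(toy_risk, gleason_7, baseline, "gleason", "ppc")
print(f"Gleason 9, PPC 60%: {high:.3f}")
print(f"Gleason 7, PPC 60%: {low:.3f}")
```

In this toy model the interaction value is nonzero only for the high-Gleason patient, because the invented interaction term is active only when both thresholds are crossed. This mirrors the shape of the reported finding but says nothing about its magnitude.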
Conclusion: We describe a novel application of SHAP values for modeling and visualizing nonlinear interaction effects in prostate cancer. This ML-based approach is a promising technique with the potential to meaningfully improve risk stratification and staging systems.