From Local Explanations to Global Understanding with Explainable AI for Trees

Overview
Journal Nat Mach Intell
Publisher Springer Nature
Date 2020 Jul 2
PMID 32607472
Citations 1200
Abstract

Tree-based machine learning models such as random forests, decision trees, and gradient boosted trees are popular non-linear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here, we improve the interpretability of tree-based models through three main contributions: 1) the first polynomial-time algorithm to compute optimal explanations based on game theory; 2) a new type of explanation that directly measures local feature interaction effects; and 3) a new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to i) identify high-magnitude but low-frequency non-linear mortality risk factors in the US population, ii) highlight distinct population sub-groups with shared risk characteristics, iii) identify non-linear interaction effects among risk factors for chronic kidney disease, and iv) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model's performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains.
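
The idea of game-theoretic local explanations that are combined into a global summary can be illustrated with a small sketch. This is not the polynomial-time algorithm the abstract refers to: it is a brute-force Shapley value computation whose cost grows exponentially with the number of features, applied to a hypothetical three-feature tree. The tree, the feature names (age, sbp, smoker), the background rows, and the choice of value function (fixing a subset of features and averaging over background rows) are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def toy_tree(x):
    """Hypothetical hand-written decision tree over (age, sbp, smoker)."""
    age, sbp, smoker = x
    if age > 60:
        return 0.8 if sbp > 140 else 0.5
    return 0.3 if smoker else 0.1


def value(f, x, background, subset):
    """Mean output when features in `subset` take x's values and the
    remaining features take the values of each background row in turn."""
    total = 0.0
    for row in background:
        mixed = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += f(mixed)
    return total / len(background)


def shapley_values(f, x, background):
    """Exact Shapley values by enumerating every feature subset.
    Exponential in the number of features; for illustration only."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = value(f, x, background, s | {i}) - value(f, x, background, s)
                phi[i] += weight * gain
    return phi


# Assumed background data: rows of (age, sbp, smoker)
background = [(45, 120, 0), (70, 150, 1), (65, 130, 0), (30, 110, 1)]

# One local explanation per row
local = [shapley_values(toy_tree, x, background) for x in background]

# Average model output over the background data
base = sum(toy_tree(r) for r in background) / len(background)

# A simple global summary: mean absolute local attribution per feature
global_importance = [sum(abs(p[i]) for p in local) / len(local) for i in range(3)]

for x, phi in zip(background, local):
    print(x, [round(p, 3) for p in phi])
print("global:", [round(g, 3) for g in global_importance])
```

Each row's attributions sum to the difference between the model output for that row and the average output over the background data, which is the sense in which the local explanations stay faithful to the model while their aggregate describes global structure.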

Citing Articles

KinasePred: A Computational Tool for Small-Molecule Kinase Target Prediction.

Di Stefano M, Piazza L, Poles C, Galati S, Granchi C, Giordano A. Int J Mol Sci. 2025; 26(5).

PMID: 40076779 PMC: 11900317. DOI: 10.3390/ijms26052157.


Development of Clinical-Radiomics Nomogram for Predicting Post-Surgery Functional Improvement in High-Grade Glioma Patients.

Ius T, Polano M, Dal Bo M, Bagatto D, Bertani V, Gentilini D. Cancers (Basel). 2025; 17(5).

PMID: 40075605 PMC: 11899258. DOI: 10.3390/cancers17050758.


Development and Validation of Machine Learning Models for Outcome Prediction in Patients with Poor-Grade Aneurysmal Subarachnoid Hemorrhage Following Endovascular Treatment.

Du S, Wu Y, Tao J, Shu L, Yan T, Xiao B. Ther Clin Risk Manag. 2025; 21:293-307.

PMID: 40071129 PMC: 11895686. DOI: 10.2147/TCRM.S504745.


Shapley Fields Reveal Chemotopic Organization in the Mouse Olfactory Bulb Across Diverse Chemical Feature Sets.

Milicevic N, Burton S, Wachowiak M, Itskov V. bioRxiv. 2025.

PMID: 40060549 PMC: 11888437. DOI: 10.1101/2025.02.26.640432.


BacTermFinder: a comprehensive and general bacterial terminator finder using a CNN ensemble.

Taheri Ghahfarokhi S, Pena-Castillo L. NAR Genom Bioinform. 2025; 7(1):lqaf016.

PMID: 40060369 PMC: 11890068. DOI: 10.1093/nargab/lqaf016.

