Causal Interpretations of Black-Box Models

Overview
Journal: J Bus Econ Stat
Date: 2020 Nov 2
PMID: 33132490
Citations: 40
Abstract

The fields of machine learning and causal inference have developed many concepts, tools, and theories that are potentially useful for each other. Through exploring the possibility of extracting causal interpretations from black-box machine-trained models, we briefly review the languages and concepts in causal inference that may be interesting to machine learning researchers. We start with the curious observation that Friedman's partial dependence plot has exactly the same formula as Pearl's back-door adjustment, and we discuss three requirements for making causal interpretations: a model with good predictive performance, some domain knowledge in the form of a causal diagram, and suitable visualization tools. We provide several illustrative examples and find some interesting and potentially causal relations using visualization tools for black-box models.
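
The observation about partial dependence and back-door adjustment can be illustrated with a small sketch. Both quantities average the fitted prediction f(x, C) over the observed distribution of the remaining covariates C while holding the feature of interest fixed at x. The code below is not taken from the article: the synthetic data, the coefficients, the least-squares stand-in for a black-box model, and the helper name partial_dependence are all assumptions chosen for illustration. It assumes that C is the only confounder, so that C satisfies the back-door criterion.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Average prediction with one column fixed at each grid value.

    This is the empirical partial dependence (1/n) * sum_i f(v, c_i),
    which has the same form as a back-door adjustment over the
    observed values of the other columns.
    """
    X = np.asarray(X, dtype=float)
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # set the feature to v for every row
        values.append(predict(X_mod).mean())
    return np.array(values)


# Assumed synthetic data: C confounds X and Y; the true effect of X on Y is 2.
rng = np.random.default_rng(0)
n = 5000
c = rng.normal(size=n)
x = 0.8 * c + rng.normal(size=n)
y = 2.0 * x + 3.0 * c + rng.normal(scale=0.1, size=n)

# Stand-in for a black-box model: ordinary least squares on (1, X, C).
design = np.column_stack([np.ones(n), x, c])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)


def predict(features):
    """Predict Y from a two-column array holding X and C."""
    return np.column_stack([np.ones(len(features)), features]) @ beta


features = np.column_stack([x, c])
grid = np.array([-1.0, 0.0, 1.0])
pd_curve = partial_dependence(predict, features, 0, grid)

# Slope of the partial dependence curve versus the unadjusted regression slope.
slope = (pd_curve[-1] - pd_curve[0]) / (grid[-1] - grid[0])
naive_slope = np.polyfit(x, y, 1)[0]
print("partial dependence slope:", slope)
print("unadjusted slope:", naive_slope)
```

In this assumed setup the partial dependence slope should land close to the simulated effect of 2, while the unadjusted slope is inflated by the confounder. The agreement depends on the model predicting well and on the causal diagram being correct, which matches the requirements listed in the abstract.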

Citing Articles

A Resampling Approach for Causal Inference on Novel Two-Point Time-Series with Application to Identify Risk Factors for Type-2 Diabetes and Cardiovascular Disease.

Dai X, Mouti S, Vale M, Ray S, Bohn J, Goldberg L. Stat Biosci. 2025; 17(1):78-131.

PMID: 40061216 PMC: 11889075. DOI: 10.1007/s12561-023-09390-w.


Prediction of buckling damage of steel equal angle structural members using hybrid machine learning techniques.

Ho N, Le T, Dinh T, Nguyen V. Sci Rep. 2025; 15(1):4696.

PMID: 39922853 PMC: 11807133. DOI: 10.1038/s41598-025-87869-w.


Data science and automation in the process of theorizing: Machine learning's power of induction in the co-duction cycle.

Kolkman D, Lee G, van Witteloostuijn A. PLoS One. 2024; 19(11):e0309318.

PMID: 39495739 PMC: 11534228. DOI: 10.1371/journal.pone.0309318.


A framework for identifying factors controlling cyanobacterium blooms by coupled CCM-ECCM Bayesian networks.

Tal O, Ostrovsky I, Gal G. Ecol Evol. 2024; 14(6):e11475.

PMID: 38932972 PMC: 11199127. DOI: 10.1002/ece3.11475.


Combining machine learning with high-content imaging to infer ciprofloxacin susceptibility in isolates of Salmonella Typhimurium.

Tran T, Sridhar S, Reece S, Lunguya O, Jacobs J, Van Puyvelde S. Nat Commun. 2024; 15(1):5074.

PMID: 38871710 PMC: 11176356. DOI: 10.1038/s41467-024-49433-4.

