
A Biologist's Guide to Model Selection and Causal Inference

Overview
Journal Proc Biol Sci
Specialty Biology
Date 2021 Jan 27
PMID 33499782
Citations 29
Abstract

A goal of many research programmes in biology is to extract meaningful insights from large, complex datasets. Researchers in ecology, evolution and behaviour (EEB) often grapple with long-term, observational datasets from which they construct models to test causal hypotheses about biological processes. Similarly, epidemiologists analyse large, complex observational datasets to understand the distribution and determinants of human health. A key difference in the analytical workflows for these two distinct areas of biology is the delineation of data analysis tasks and explicit use of causal directed acyclic graphs (DAGs), widely adopted by epidemiologists. Here, we review the most recent causal inference literature and describe an analytical workflow that has direct applications for EEB. We start this commentary by defining four distinct analytical tasks (description, prediction, association, causal inference). The remainder of the text is dedicated to causal inference, specifically focusing on the use of DAGs to inform the modelling strategy. Given the increasing interest in causal inference and misperceptions regarding this task, we seek to facilitate an exchange of ideas between disciplinary silos and provide an analytical framework that is particularly relevant for making causal inferences from observational data.

Citing Articles

Experience sampling method studies in physical activity research: the relevance of causal reasoning.

Poppe L, De Paepe A, Deforche B, Van Dyck D, Loeys T, Van Cauwenberg J. Int J Behav Nutr Phys Act. 2025; 22(1):28.

PMID: 40045348 PMC: 11884166. DOI: 10.1186/s12966-025-01723-w.


Innovation in ant larval feeding facilitated queen-worker divergence and social complexity.

Matte A, LeBoeuf A. Proc Natl Acad Sci U S A. 2025; 122(9):e2413742122.

PMID: 39999174 PMC: 11892636. DOI: 10.1073/pnas.2413742122.


Determining interaction directionality in complex biochemical networks from stationary measurements.

Leibovich N. Sci Rep. 2025; 15(1):3004.

PMID: 39849082 PMC: 11758029. DOI: 10.1038/s41598-025-86332-0.


Causal Inference With Observational Data and Unobserved Confounding Variables.

Byrnes J, Dee L. Ecol Lett. 2025; 28(1):e70023.

PMID: 39836442 PMC: 11750058. DOI: 10.1111/ele.70023.


The effectiveness of harvest for limiting wildlife disease: Insights from 20 years of chronic wasting disease in Wyoming.

Moss W, Binfet J, Hall L, Allen S, Edwards W, Jennings-Gaines J. Ecol Appl. 2025; 35(1):e3089.

PMID: 39835473 PMC: 11748107. DOI: 10.1002/eap.3089.

