
Missing Data: A Statistical Framework for Practice

Overview
Journal Biom J
Specialty Public Health
Date 2021 Feb 24
PMID 33624862
Citations 43
Abstract

Missing data are ubiquitous in medical research, yet there is still uncertainty over when restricting to the complete records is likely to be acceptable, when more complex methods (e.g. maximum likelihood, multiple imputation and Bayesian methods) should be used, how they relate to each other and the role of sensitivity analysis. This article seeks to address both applied practitioners and researchers interested in a more formal explanation of some of the results. For practitioners, the framework, illustrative examples and code should equip them with a practical approach to address the issues raised by missing data (particularly using multiple imputation), alongside an overview of how the various approaches in the literature relate. In particular, we describe how multiple imputation can be readily used for sensitivity analyses, which are still infrequently performed. For those interested in more formal derivations, we give outline arguments for key results, use simple examples to show how methods relate, and provide references for full details. The ideas are illustrated with a cohort study, a multi-centre case control study and a randomised clinical trial.

Citing Articles

Adherence to 24-hour movement guidelines and associations with mental well-being: a population-based study with adolescents in Canada.

Oberle E, Fan S, Molyneux T, Ji X, Brussoni M BMC Public Health. 2025; 25(1):749.

PMID: 40050844 PMC: 11884116. DOI: 10.1186/s12889-025-21857-7.


Effect of Tai Chi combined with music therapy on the cognitive function in older adult individuals with mild cognitive impairment.

Zhou C Front Public Health. 2025; 13:1475863.

PMID: 39935882 PMC: 11810935. DOI: 10.3389/fpubh.2025.1475863.


Bayesian semiparametric inference in longitudinal metabolomics data.

Sarkar A, Cominetti O, Montoliu I, Hosking J, Pinkney J, Martin F Sci Rep. 2024; 14(1):31336.

PMID: 39732846 PMC: 11682272. DOI: 10.1038/s41598-024-82718-8.


Is inverse probability of censoring weighting a safer choice than per-protocol analysis in clinical trials?

Xuan J, Mt-Isa S, Latimer N, Gorrod H, Malbecq W, Vandormael K Stat Methods Med Res. 2024; 34(2):286-306.

PMID: 39668583 PMC: 11874582. DOI: 10.1177/09622802241289559.


Naloxone administration and survival in overdoses involving opioids and stimulants: An analysis of law enforcement data from 63 Pennsylvania counties.

Cano M, Jones A, Silverstein S, Daniulaityte R, LoVecchio F Int J Drug Policy. 2024; 135:104678.

PMID: 39637491 PMC: 11724750. DOI: 10.1016/j.drugpo.2024.104678.

