Missing Data Imputation: Focusing on Single Imputation

Overview
Journal Ann Transl Med
Date 2016 Feb 9
PMID 26855945
Citations 122
Abstract

Complete case analysis is widely used for handling missing data, and it is the default method in many statistical packages. However, this method may introduce bias, and it discards useful information from the analysis. Many imputation methods have therefore been developed to fill this gap. The present article focuses on single imputation. Imputation with the mean, median, or mode is simple but, like complete case analysis, can bias estimates of the mean and standard deviation. Furthermore, these methods ignore relationships with other variables. Regression imputation can preserve the relationship between the variable with missing values and other variables. Many sophisticated methods exist for handling missing values in longitudinal data. This article focuses primarily on how to implement single imputation in R, while avoiding complex mathematical calculations.
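The two families of single imputation named in the abstract can be illustrated with a minimal sketch. The article itself works in R; the example below uses Python's standard library purely for illustration. The function names (`impute_constant`, `impute_regression`) and the toy `age` and `lactate` values are hypothetical and do not come from the article.

```python
from statistics import mean, median, mode, stdev

def impute_constant(values, how="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[how](observed)
    return [fill if v is None else v for v in values]

def impute_regression(x, y):
    """Replace None entries in y with predictions from a least-squares
    line fitted on the complete (x, y) pairs. Assumes x is fully observed."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(xi for xi, _ in pairs)
    my = mean(yi for _, yi in pairs)
    sxx = sum((xi - mx) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]

# Hypothetical toy data: lactate is missing for two of five patients.
age = [20, 30, 40, 50, 60]
lactate = [1.0, None, 3.0, None, 5.0]

mean_filled = impute_constant(lactate, "mean")
reg_filled = impute_regression(age, lactate)

# Mean imputation piles the filled values at the centre, so the spread
# of the completed variable is smaller than that of the observed values.
observed_sd = stdev(v for v in lactate if v is not None)
mean_filled_sd = stdev(mean_filled)
```

In this toy case mean imputation assigns the same value to both missing entries regardless of age, and the standard deviation of the completed variable is smaller than that of the observed values. Regression imputation instead places the filled values on the fitted line, so they vary with age.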

Citing Articles

Optimization of school physical education schedules to enhance long-term public health outcomes.

Tao S, Sheng-Ping Z, Meng-Yuan W Front Public Health. 2025; 13:1548056.

PMID: 40046114 PMC: 11879960. DOI: 10.3389/fpubh.2025.1548056.


Hinge-FM2I: an approach using image inpainting for interpolating missing data in univariate time series.

Noufel S, Maaroufi N, Najib M, Bakhouya M Sci Rep. 2025; 15(1):5389.

PMID: 39948363 PMC: 11825853. DOI: 10.1038/s41598-025-86382-4.


Study on influencing factors of age-adjusted Charlson comorbidity index in patients with Alzheimer's disease based on machine learning model.

Ding J, Long Z, Liu Y, Wang M Front Med (Lausanne). 2025; 12:1497662.

PMID: 39931556 PMC: 11807998. DOI: 10.3389/fmed.2025.1497662.


Medication-related hospitalisations in patients with SLE.

Stanciu M, Lee J, McDonald E, Clark G, Pineau C, Kalache F Lupus Sci Med. 2025; 12(1).

PMID: 39884714 PMC: 11784163. DOI: 10.1136/lupus-2024-001362.


Development and validation of a prognostic model for critically ill type 2 diabetes patients in ICU based on composite inflammatory indicators.

Liu L, Zhao Y, Cheng Z, Li Y, Liu Y Sci Rep. 2025; 15(1):3627.

PMID: 39880877 PMC: 11779909. DOI: 10.1038/s41598-025-87731-z.

