Combining Fourier and Lagged K-nearest Neighbor Imputation for Biomedical Time Series Data

Overview
Journal: J Biomed Inform
Publisher: Elsevier
Date: 2015 Oct 20
PMID: 26477633
Citations: 18
Abstract

Most clinical and biomedical data contain missing values: a patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. These missing values are often ignored, which can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead, whether a variable such as blood glucose is measured may depend on its prior values as well as those of other variables. These dependencies also exist across time, but current methods have yet to incorporate these temporal relationships or to handle multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time-lagged correlations both within and across variables by combining two imputation methods, one based on an extension to k-NN and the other on the Fourier transform. This enables imputation of missing values even when all data at a time point are missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring), the proposed method has the highest imputation accuracy. This held when up to half the data were missing and when runs of consecutive missing values made up a significant fraction of the overall time series length.
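
The abstract describes combining a lagged k-NN estimate with a Fourier-based estimate. The following is a minimal single-variable sketch of that idea in Python, not the authors' implementation: the function names, the mean-fill before the transform, the low-pass cutoff `keep`, the defaults for `lags` and `k`, and the plain averaging of the two estimates are all assumptions made for illustration. The published method also uses lagged correlations across variables, which this sketch omits.

```python
import cmath
import math


def fourier_impute(series, keep=5):
    """Fill None entries from a low-pass DFT reconstruction (illustrative only)."""
    n = len(series)
    observed = [x for x in series if x is not None]
    if not observed:
        raise ValueError("series has no observed values")
    mean = sum(observed) / len(observed)
    # Assumption: gaps are mean-filled before the transform.
    filled = [mean if x is None else x for x in series]
    # Naive O(n^2) forward DFT, adequate for a short example.
    coeffs = [
        sum(filled[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    # Zero every component above the `keep` lowest frequencies
    # (index f and its mirror n - f describe the same frequency).
    for f in range(n):
        if min(f, n - f) > keep:
            coeffs[f] = 0
    # Inverse DFT; the input is real, so only the real part is kept.
    smooth = [
        sum(coeffs[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)).real / n
        for t in range(n)
    ]
    return [smooth[t] if series[t] is None else series[t] for t in range(n)]


def lagged_knn_impute(series, lags=2, k=3):
    """Fill None entries by averaging the values that followed similar lag windows."""
    n = len(series)

    def lag_vector(t):
        # The `lags` values immediately before t, or None if the window is incomplete.
        if t < lags:
            return None
        window = series[t - lags:t]
        return None if any(v is None for v in window) else window

    # Candidate neighbors: observed time points with a complete lag window.
    candidates = []
    for t in range(n):
        if series[t] is not None and lag_vector(t) is not None:
            candidates.append((lag_vector(t), series[t]))

    result = list(series)
    for t in range(n):
        if series[t] is None:
            query = lag_vector(t)
            if query is None or not candidates:
                continue  # no usable lag window: leave the gap for the Fourier estimate
            nearest = sorted(candidates, key=lambda c: math.dist(c[0], query))[:k]
            result[t] = sum(value for _, value in nearest) / len(nearest)
    return result


def combined_impute(series, lags=2, k=3, keep=5):
    """Average the two estimates; fall back to Fourier alone where k-NN has none."""
    knn = lagged_knn_impute(series, lags, k)
    fourier = fourier_impute(series, keep)
    combined = []
    for original, a, b in zip(series, knn, fourier):
        if original is not None:
            combined.append(original)   # observed values are never altered
        elif a is None:
            combined.append(b)          # only the Fourier estimate is available
        else:
            combined.append((a + b) / 2)
    return combined
```

In this sketch, a gap whose lag window is itself incomplete (for example, the second of two consecutive missing values) receives no k-NN estimate, so the Fourier estimate fills it alone. That is one simple way to see why pairing the two approaches can cover cases that either would handle poorly by itself.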

Citing Articles

Intraoperative circulation predict prolonged length of stay after head and neck free flap reconstruction: a retrospective study based on machine learning.

Liu Z, Wen J, Chen Y, Zhou B, Cao M, Guo M Front Oncol. 2025; 14:1473447.

PMID: 39868373 PMC: 11757266. DOI: 10.3389/fonc.2024.1473447.


Inflammatory burden index: associations between osteoarthritis and all-cause mortality among individuals with osteoarthritis.

Xiong Z, Xu W, Wang Y, Cao S, Zeng X, Yang P BMC Public Health. 2024; 24(1):2203.

PMID: 39138465 PMC: 11323649. DOI: 10.1186/s12889-024-19632-1.


Binned Data Provide Better Imputation of Missing Time Series Data from Wearables.

Chakrabarti S, Biswas N, Karnani K, Padul V, Jones L, Kesari S Sensors (Basel). 2023; 23(3).

PMID: 36772494 PMC: 9919790. DOI: 10.3390/s23031454.


Machine learning modeling practices to support the principles of AI and ethics in nutrition research.

Thomas D, Kleinberg S, Brown A, Crow M, Bastian N, Reisweber N Nutr Diabetes. 2022; 12(1):48.

PMID: 36456550 PMC: 9715415. DOI: 10.1038/s41387-022-00226-y.


Classification of Level of Consciousness in a Neurological ICU Using Physiological Data.

Gomez L, Shen Q, Doyle K, Vrosgou A, Velazquez A, Megjhani M Neurocrit Care. 2022; 38(1):118-128.

PMID: 36109448 PMC: 9935697. DOI: 10.1007/s12028-022-01586-0.
