How Are Missing Data in Covariates Handled in Observational Time-to-event Studies in Oncology? A Systematic Review

Overview
Publisher BioMed Central
Date 2020 May 31
PMID 32471366
Citations 20
Abstract

Background: Missing data in covariates can result in biased estimates and loss of power to detect associations. In time-to-event analyses they also complicate the handling of time-varying effects of covariates, the selection of covariates and their flexible modelling. This review aims to describe how researchers approach time-to-event analyses with missing data.

Methods: Medline and Embase were searched for observational time-to-event studies in oncology published from January 2012 to January 2018. The review focused on proportional hazards models or extended Cox models. We investigated the extent and reporting of missing data and how they were addressed in the analysis. Covariate modelling and selection, and assessment of the proportional hazards assumption, were also investigated, alongside the treatment of missing data in these procedures.

Results: 148 studies were included. The mean proportion of individuals with missingness in any covariate was 32%. 53% of studies used complete-case analysis, and 22% used multiple imputation. In total, 14% of studies stated an assumption concerning missing data and only 34% stated missingness as a limitation. The proportional hazards assumption was checked in 28% of studies, of which 17% did not state the assessment method. A covariate selection procedure was stated for 58% of 144 multivariable models; use of a pre-selected set of covariates was the most popular, followed by stepwise methods and univariable analyses. Of 69 studies that included continuous covariates, 81% did not assess the appropriateness of the functional form.

Conclusion: While guidelines for handling missing data in epidemiological studies are in place, this review indicates that few studies report implementing the recommendations in practice. Although missing data are present in many studies, we found that few state clearly how they handled them or the assumptions they made. Easy-to-implement but potentially biased approaches such as complete-case analysis are most commonly used, even though they rely on strong assumptions and more appropriate methods should often be employed instead. Authors should be encouraged to follow existing guidelines to address missing data, and increased levels of expectation from journals and editors could be used to improve practice.
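Two quantities in the abstract, the proportion of individuals with missingness in any covariate and the sample retained under complete-case analysis, can be illustrated with a minimal sketch. The toy cohort, the covariate names (age, stage, bmi) and the helper functions below are hypothetical and are not taken from the review; the sketch only shows how the two quantities relate for a single study.

```python
from typing import Optional

# Hypothetical toy cohort (not data from the review): one record per
# individual, with None marking a missing covariate value.
Record = dict[str, Optional[float]]

cohort: list[Record] = [
    {"age": 61.0, "stage": 2.0, "bmi": 27.4},
    {"age": 58.0, "stage": None, "bmi": 31.0},
    {"age": None, "stage": 3.0, "bmi": None},
    {"age": 70.0, "stage": 1.0, "bmi": 24.9},
    {"age": 66.0, "stage": 4.0, "bmi": None},
]


def has_missing(record: Record) -> bool:
    """Return True if at least one covariate value is missing."""
    return any(value is None for value in record.values())


def proportion_with_missingness(records: list[Record]) -> float:
    """Proportion of individuals with a missing value in any covariate."""
    if not records:
        raise ValueError("cohort is empty")
    return sum(has_missing(r) for r in records) / len(records)


def complete_cases(records: list[Record]) -> list[Record]:
    """Complete-case analysis set: drop anyone with a missing covariate."""
    return [r for r in records if not has_missing(r)]


prop_missing = proportion_with_missingness(cohort)
retained = complete_cases(cohort)

print(f"Individuals with missingness in any covariate: {prop_missing:.0%}")
print(f"Retained under complete-case analysis: {len(retained)} of {len(cohort)}")
```

In this sketch the individuals dropped by complete-case analysis are exactly those counted in the missingness proportion, which is why a high proportion translates directly into a smaller analysis set and lost power. Whether the remaining estimates are also biased depends on why the values are missing, which the code does not model.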

Citing Articles

Excess weight by degree and duration and cancer risk (ABACus2 consortium): a cohort study and individual participant data meta-analysis.

Hawwash N, Sperrin M, Martin G, Sinha R, Matthews C, Ricceri F. EClinicalMedicine. 2024; 78:102921.

PMID: 39640936 PMC: 11617392. DOI: 10.1016/j.eclinm.2024.102921.


Moving Beyond Medical Statistics: A Systematic Review on Missing Data Handling in Electronic Health Records.

Ren W, Liu Z, Wu Y, Zhang Z, Hong S, Liu H. Health Data Sci. 2024; 4:0176.

PMID: 39635227 PMC: 11615160. DOI: 10.34133/hds.0176.


A novel MissForest-based missing values imputation approach with recursive feature elimination in medical applications.

Hu Y, Wu R, Lin Y, Lin T. BMC Med Res Methodol. 2024; 24(1):269.

PMID: 39516783 PMC: 11546113. DOI: 10.1186/s12874-024-02392-2.


Gaps in the usage and reporting of multiple imputation for incomplete data: findings from a scoping review of observational studies addressing causal questions.

Mainzer R, Moreno-Betancur M, Nguyen C, Simpson J, Carlin J, Lee K. BMC Med Res Methodol. 2024; 24(1):193.

PMID: 39232661 PMC: 11373423. DOI: 10.1186/s12874-024-02302-6.


Addressing immortal time bias in precision medicine: Practical guidance and methods development.

Weymann D, Krebs E, Regier D. Health Serv Res. 2024; 60(1):e14376.

PMID: 39225454 PMC: 11782076. DOI: 10.1111/1475-6773.14376.

