On the Estimation of Inverse-probability-of-censoring Weights for the Evaluation of Survival Prediction Error

Overview

Journal PLoS One

Date 2025 Jan 31

PMID 39888901

Authors

Thomas Prince

Andrea Bommert

Jorg Rahnenfuhrer

Matthias Schmid

Abstract

Inverse probability weighting (IPW) is a popular method for making inferences regarding unobserved or unobservable data of a target population based on observed data. This paper considers IPW applied to right-censored time-to-event data. We investigate the behavior of the inverse-probability-of-censoring weighted (IPCW) Brier score, which is frequently used to assess the predictive performance of time-to-event models. A key requirement of the IPCW Brier score is the estimation of the censoring distribution, which is needed to compute the weights. The established paradigm of splitting a dataset into a training set and a test set for model fitting and evaluation raises the question of which of these datasets should be used to fit the censoring model. There seems to be considerable disagreement among authors regarding this issue, and no standard has been established so far. To shed light on this question, we conducted a comprehensive experimental study exploring various data scenarios and estimation schemes. We found that it is generally of little importance which dataset is used to model the censoring distribution. However, in some circumstances, such as a covariate-dependent censoring process, a small sample size, or noisy data, it may be advisable to use the test set instead of the training set to model the censoring distribution. A detailed set of practical recommendations concludes our paper.
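To make the choice between the two datasets concrete, the following is a minimal sketch of an IPCW Brier score at a single time point, in the form commonly attributed to Graf et al. (1999). It is not the authors' implementation. It assumes a covariate-independent Kaplan-Meier estimate of the censoring distribution; the function names `km_censoring` and `ipcw_brier`, the simulated data, and the evaluation time are hypothetical and chosen only for illustration.

```python
import numpy as np

def km_censoring(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    Censoring is treated as the outcome of interest, i.e. the indicator
    used here is 1 - event. Returns a callable G(t, left_limit=False).
    """
    time = np.asarray(time, dtype=float)
    cens = 1 - np.asarray(event, dtype=int)
    uniq = np.unique(time)
    at_risk = np.array([(time >= u).sum() for u in uniq])
    n_cens = np.array([((time == u) & (cens == 1)).sum() for u in uniq])
    steps = np.concatenate(([1.0], np.cumprod(1.0 - n_cens / at_risk)))

    def G(t, left_limit=False):
        # side="right" counts jump times <= t, giving G(t);
        # side="left" counts jump times < t, giving the left limit G(t-).
        side = "left" if left_limit else "right"
        return steps[np.searchsorted(uniq, t, side=side)]

    return G

def ipcw_brier(time, event, surv_pred, t, G):
    """IPCW Brier score at time t for predicted survival probabilities S(t | x_i).

    Subjects with an observed event by t get weight 1 / G(T_i-), subjects
    still at risk after t get weight 1 / G(t), and subjects censored
    before t get weight 0.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv_pred = np.asarray(surv_pred, dtype=float)
    died = (time <= t) & (event == 1)
    alive = time > t
    w = np.zeros_like(time)
    w[died] = 1.0 / G(time[died], left_limit=True)
    w[alive] = 1.0 / G(t)
    return np.mean(w * (alive.astype(float) - surv_pred) ** 2)

# Illustrative comparison on simulated data (all settings are assumptions).
rng = np.random.default_rng(0)
n = 200
t_event = rng.exponential(1.0, n)      # latent event times
t_cens = rng.exponential(2.0, n)       # latent censoring times
obs_time = np.minimum(t_event, t_cens)
obs_event = (t_event <= t_cens).astype(int)
train, test = slice(0, 100), slice(100, 200)

t_eval = 1.0
pred = np.full(100, np.exp(-t_eval))   # a fixed prediction for every test subject

G_train = km_censoring(obs_time[train], obs_event[train])
G_test = km_censoring(obs_time[test], obs_event[test])
print("weights from training set:", ipcw_brier(obs_time[test], obs_event[test], pred, t_eval, G_train))
print("weights from test set:    ", ipcw_brier(obs_time[test], obs_event[test], pred, t_eval, G_test))
```

The only difference between the two printed scores is the dataset passed to `km_censoring`, which is the choice examined in the abstract. The sketch does not guard against a censoring estimate of zero, which can occur at late time points or when the training and test follow-up ranges differ.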
