
A Cautionary Note on Using Propensity Score Calibration to Control for Unmeasured Confounding Bias When the Surrogacy Assumption Is Absent

Overview
Journal Am J Epidemiol
Specialty Public Health
Date 2023 Sep 27
PMID 37759344
Abstract

Conventional propensity score methods encounter challenges when unmeasured confounding is present, because the gold-standard propensity score cannot be estimated accurately when data on certain confounders are unavailable. Propensity score calibration (PSC) addresses this issue by constructing a surrogate for the gold-standard propensity score under the surrogacy assumption. This assumption posits that the error-prone propensity score, based on observed confounders, is independent of the outcome conditional on the gold-standard propensity score and the exposure. However, this assumption implies that confounders cannot directly affect the outcome and that their effects on the outcome are mediated solely through the propensity score. This raises concerns regarding the applicability of PSC in practical settings where confounders can directly affect the outcome. While PSC targets a conditional treatment effect by conditioning on a subject's unobservable propensity score, the causal interest in the latter case lies in a treatment effect conditional on a subject's baseline characteristics. Our analysis reveals that PSC is generally biased unless the effects of confounders on the outcome and treatment are proportional to each other. Furthermore, we identify 2 sources of bias: 1) the noncollapsibility of effect measures, such as the odds ratio or hazard ratio, and 2) residual confounding, as the calibrated propensity score may not possess the properties of a valid propensity score.
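The proportionality condition in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis; it is a minimal illustration under assumptions chosen for simplicity: one measured confounder x and one unmeasured confounder u (both standard normal), a logistic exposure model, a linear outcome model (so noncollapsibility plays no role and any bias reflects residual confounding), an internal validation subsample in which u is observed, and calibration on the logit scale of the propensity score. All function names, parameter names, and parameter values are hypothetical.

```python
import numpy as np


def fit_logistic(Z, a, iters=25):
    """Logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    D = np.column_stack([np.ones(len(a)), Z])
    beta = np.zeros(D.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-D @ beta))
        hess = (D * (p * (1.0 - p))[:, None]).T @ D
        beta += np.linalg.solve(hess, D.T @ (a - p))
    return beta


def ols(D, y):
    """Ordinary least squares coefficients for design matrix D."""
    return np.linalg.lstsq(D, y, rcond=None)[0]


def psc_estimate(bx, bu, ax=1.0, au=1.0, tau=1.0,
                 n=200_000, n_val=50_000, seed=0):
    """PSC-style estimate of the exposure effect tau (illustrative only).

    ax, au: effects of x and u on the exposure (logit scale).
    bx, bu: effects of x and u on the outcome (assumed linear model).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)   # measured confounder
    u = rng.normal(size=n)   # confounder unmeasured outside the validation set
    a = rng.binomial(1, 1.0 / (1.0 + np.exp(-(ax * x + au * u)))).astype(float)
    y = tau * a + bx * x + bu * u + rng.normal(size=n)

    # Error-prone score (logit scale): uses x only, available for everyone.
    b_ep = fit_logistic(x, a)
    l_ep = b_ep[0] + b_ep[1] * x

    # Gold-standard score (logit scale): uses x and u, validation subset only.
    v = slice(0, n_val)
    b_gs = fit_logistic(np.column_stack([x[v], u[v]]), a[v])
    l_gs = b_gs[0] + b_gs[1] * x[v] + b_gs[2] * u[v]

    # Calibration model: gold-standard score on exposure and error-prone score.
    ones = np.ones(n)
    cal = ols(np.column_stack([ones[v], a[v], l_ep[v]]), l_gs)
    l_cal = cal[0] + cal[1] * a + cal[2] * l_ep

    # Outcome model adjusted for the calibrated score; return the exposure effect.
    return ols(np.column_stack([ones, a, l_cal]), y)[1]


if __name__ == "__main__":
    # Proportional effects: (bx, bu) = 1.0 * (ax, au).
    print("proportional:    ", psc_estimate(bx=1.0, bu=1.0))
    # Non-proportional effects: u matters more for the outcome than x does.
    print("non-proportional:", psc_estimate(bx=0.5, bu=2.0))
```

With the true effect set to tau = 1.0, the proportional setting should return an estimate close to 1.0, while the non-proportional setting should remain visibly biased. Because the outcome model here is linear, this sketch speaks only to the residual-confounding source of bias, not to noncollapsibility.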

Citing Articles

Propensity Score Matching: should we use it in designing observational studies?

Wan F. BMC Med Res Methodol. 2025; 25(1):25.

PMID: 39875810 PMC: 11776168. DOI: 10.1186/s12874-025-02481-w.
