
Step-adjusted Tree-based Reinforcement Learning for Evaluating Nested Dynamic Treatment Regimes Using Test-and-treat Observational Data

Overview
Journal Stat Med
Publisher Wiley
Specialty Public Health
Date 2021 Sep 7
PMID 34490942
Citations 1
Abstract

Dynamic treatment regimes (DTRs) consist of a sequence of treatment decision rules, in which treatment is adapted over time in response to changes in an individual's disease progression and health care history. In medical practice, nested test-and-treat strategies are commonly used to improve cost-effectiveness. For example, among patients at risk of prostate cancer, only those with a high prostate-specific antigen (PSA) level need a biopsy, which is costly and invasive, to confirm the diagnosis and help determine treatment if needed. The treatment decision is made after the biopsy and is thus nested within the decision of whether to perform the test. However, existing statistical methods cannot accommodate this natural nesting of the treatment decision within the test decision. Therefore, we developed a new statistical learning method, step-adjusted tree-based reinforcement learning, to evaluate DTRs within such a nested multistage dynamic decision framework using observational data. At each step within each stage, we combined robust semiparametric estimation via augmented inverse probability weighting with a tree-based reinforcement learning method to handle the counterfactual optimization. Simulation studies demonstrated robust performance of the proposed method under different scenarios. We further applied our method to evaluate the necessity of prostate biopsy and to identify the optimal test-and-treat regimes for prostate cancer patients, using data from the Johns Hopkins University prostate cancer active surveillance dataset.
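
To make the combination of augmented inverse probability weighting (AIPW) and tree-based search concrete, the following is a minimal single-step sketch in Python. It is not the authors' implementation: the function names `aipw_pseudo_outcomes` and `best_stump` are hypothetical, the propensity scores and outcome-model predictions are assumed to have been estimated elsewhere, and a single threshold split on one covariate stands in for the full tree-growing and step-adjustment procedure, which the abstract does not specify.

```python
import numpy as np


def aipw_pseudo_outcomes(y, a, propensity, outcome_pred):
    """Per-subject AIPW pseudo-outcomes for each treatment option.

    y            : (n,)   observed outcomes
    a            : (n,)   observed treatments coded 0..K-1
    propensity   : (n, K) estimated P(A = k | X)        (assumed given)
    outcome_pred : (n, K) estimated E[Y | X, A = k]     (assumed given)
    Returns an (n, K) array whose column means estimate the
    counterfactual mean outcome under each treatment.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=int)
    propensity = np.asarray(propensity, dtype=float)
    outcome_pred = np.asarray(outcome_pred, dtype=float)
    k = propensity.shape[1]
    # indicator[i, j] is 1 when subject i actually received treatment j
    indicator = (a[:, None] == np.arange(k)[None, :]).astype(float)
    # Weighted residual (IPW part) plus model prediction (augmentation)
    return indicator / propensity * (y[:, None] - outcome_pred) + outcome_pred


def best_stump(x, pseudo):
    """Illustrative depth-1 search over thresholds of one covariate.

    Each side of the split is assigned the treatment with the largest
    summed pseudo-outcome. Returns (value, threshold, left_treatment,
    right_treatment), or None if x has a single distinct value.
    """
    x = np.asarray(x, dtype=float)
    best = None
    for c in np.unique(x)[:-1]:
        left = x <= c
        left_sums = pseudo[left].sum(axis=0)
        right_sums = pseudo[~left].sum(axis=0)
        value = (left_sums.max() + right_sums.max()) / len(x)
        if best is None or value > best[0]:
            best = (value, c, int(left_sums.argmax()), int(right_sums.argmax()))
    return best


# Toy data, invented purely for illustration
y = np.array([1.0, 2.0, 3.0, 4.0])
a = np.array([1, 0, 1, 0])
x = np.array([0.0, 1.0, 2.0, 3.0])
propensity = np.full((4, 2), 0.5)
outcome_pred = np.zeros((4, 2))  # a zero outcome model reduces AIPW to plain IPW

pseudo = aipw_pseudo_outcomes(y, a, propensity, outcome_pred)
result = best_stump(x, pseudo)
```

The AIPW form is what gives the estimator its robustness: the column means remain consistent if either the propensity model or the outcome model is correctly specified. In a nested test-and-treat setting, a search of this kind would be repeated at each step within each stage, which this sketch does not attempt to reproduce.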

Citing Articles

Energy landscape analysis and time-series clustering analysis of patient state multistability related to rheumatoid arthritis drug treatment: The KURAMA cohort study.

Yamamoto K, Sakaguchi M, Onishi A, Yokoyama S, Matsui Y, Yamamoto W PLoS One. 2024; 19(5):e0302308.

PMID: 38709812 PMC: 11073743. DOI: 10.1371/journal.pone.0302308.
