Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data

Overview

Journal: Healthc Inform

Specialties: Health Services, Medical Informatics

Date: 2018 Mar 21

PMID: 29556119

Citations: 17

Authors

Ying Liu

Brent Logan

Ning Liu

Zhiyuan Xu

Jian Tang

Yanzhi Wang


Abstract

In this paper, we propose the first deep reinforcement learning framework to estimate optimal Dynamic Treatment Regimes from observational medical data. Compared with existing reinforcement learning methods, this framework is more flexible and adaptive for high-dimensional action and state spaces, which lets it model the real-life complexity of heterogeneous disease progression and treatment choices, with the goal of providing doctors and patients with data-driven, personalized decision recommendations. The proposed framework contains a supervised learning step to predict the most likely expert actions and a deep reinforcement learning step to estimate the long-term value function of Dynamic Treatment Regimes. We motivate and implement the framework on a data set from the Center for International Blood and Marrow Transplant Research (CIBMTR) registry database, focusing on the sequence of prevention and treatment for acute and chronic graft-versus-host disease. We present results from the initial implementation, which show promising accuracy in predicting human expert decisions, along with an initial implementation of the reinforcement learning step.
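The two steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces both deep networks with lookup tables, and the states, actions, rewards, and function names (`fit_expert_policy`, `fit_q_values`, `greedy_action`) are hypothetical, chosen only to show how a supervised "most likely expert action" estimate can differ from the action with the highest estimated long-term value.

```python
from collections import Counter, defaultdict

# Hypothetical logged transitions: (state, expert_action, reward, next_state, done).
# These values are invented for illustration and are not registry data.
TRANSITIONS = [
    ("s0", "a0", 0.0, "s1", False),
    ("s0", "a0", 0.0, "s1", False),
    ("s0", "a1", 0.0, "s2", False),
    ("s1", "a0", 1.0, None, True),
    ("s1", "a0", 1.0, None, True),
    ("s1", "a1", 0.0, None, True),
    ("s2", "a0", 0.0, None, True),
    ("s2", "a1", 2.0, None, True),
    ("s2", "a1", 2.0, None, True),
]
ACTIONS = ["a0", "a1"]


def fit_expert_policy(transitions):
    """Supervised step (tabular stand-in): most frequent expert action per state."""
    counts = defaultdict(Counter)
    for state, action, _, _, _ in transitions:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


def fit_q_values(transitions, actions, gamma=0.9, lr=0.5, sweeps=200):
    """Value step (tabular stand-in): Q-learning updates replayed over logged data."""
    q = defaultdict(float)
    for _ in range(sweeps):
        for state, action, reward, next_state, done in transitions:
            if done:
                target = reward
            else:
                # Bootstrap from the best estimated action in the next state.
                target = reward + gamma * max(q[(next_state, b)] for b in actions)
            q[(state, action)] += lr * (target - q[(state, action)])
    return q


def greedy_action(q, state, actions):
    """Recommend the action with the highest estimated long-term value."""
    return max(actions, key=lambda a: q[(state, a)])


expert = fit_expert_policy(TRANSITIONS)
q = fit_q_values(TRANSITIONS, ACTIONS)
```

In this toy data, the most frequent expert action in `s0` is `a0`, while the value estimate favors `a1` because it leads to a state with a larger eventual reward. A deep variant would swap the tables for neural networks so that high-dimensional states and actions can be handled.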

Citing Articles

Artificial Intelligence in Audiology: A Scoping Review of Current Applications and Future Directions.

Frosolini A, Franz L, Caragli V, Genovese E, de Filippis C, Marioni G. Sensors (Basel). 2024; 24(22).

PMID: 39598904 PMC: 11598364. DOI: 10.3390/s24227126.

Enhancing Medical Training Through Learning From Mistakes by Interacting With an Ill-Trained Reinforcement Learning Agent.

Kakdas Y, Kockara S, Halic T, Demirel D. IEEE Trans Learn Technol. 2024; 17:1248-1260.

PMID: 39431279 PMC: 11486497. DOI: 10.1109/tlt.2024.3372508.

PrescDRL: deep reinforcement learning for herbal prescription planning in treatment of chronic diseases.

Yang K, Yu Z, Su X, Zhang F, He X, Wang N. Chin Med. 2024; 19(1):144.

PMID: 39415223 PMC: 11481742. DOI: 10.1186/s13020-024-01005-w.

Mathematical Model-Driven Deep Learning Enables Personalized Adaptive Therapy.

Gallagher K, Strobl M, Park D, Spoendlin F, Gatenby R, Maini P. Cancer Res. 2024; 84(11):1929-1941.

PMID: 38569183 PMC: 11148552. DOI: 10.1158/0008-5472.CAN-23-2040.

Optimizing warfarin dosing for patients with atrial fibrillation using machine learning.

Petch J, Nelson W, Wu M, Ghassemi M, Benz A, Fatemi M. Sci Rep. 2024; 14(1):4516.

PMID: 38402362 PMC: 10894214. DOI: 10.1038/s41598-024-55110-9.
