
Harnessing the Power of Synthetic Data in Healthcare: Innovation, Application, and Privacy

Overview
Journal NPJ Digit Med
Date 2023 Oct 9
PMID 37813960
Abstract

Data-driven decision-making in modern healthcare underpins innovation and predictive analytics in public health and clinical research. Synthetic data has shown promise in finance and economics to improve risk assessment, portfolio optimization, and algorithmic trading. However, higher stakes, potential liabilities, and healthcare practitioner distrust make clinical use of synthetic data difficult. This paper explores the potential benefits and limitations of synthetic data in the healthcare analytics context. We begin with real-world healthcare applications of synthetic data that inform government policy, enhance data privacy, and augment datasets for predictive analytics. We then preview future applications of synthetic data in the emergent field of digital twin technology. We explore the issues of data quality and data bias in synthetic data, which can limit its applicability across clinical use cases, and privacy concerns stemming from data misuse and risk of re-identification. Finally, we evaluate the role of regulatory agencies in promoting transparency and accountability and propose strategies for risk mitigation such as Differential Privacy (DP) and a dataset chain of custody to maintain data integrity, traceability, and accountability. Synthetic data can improve healthcare, but measures to protect patient well-being and maintain ethical standards are key to promoting responsible use.
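
The abstract names Differential Privacy (DP) as a risk-mitigation strategy but does not say which mechanism the authors have in mind. As an illustration only, the sketch below shows one textbook DP technique, the Laplace mechanism applied to a counting query. The function names, the `epsilon` values, and the toy age list are hypothetical and do not come from the paper.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a noisy count under the Laplace mechanism.

    A counting query changes by at most 1 when one record is added
    or removed (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: patient ages (not taken from the paper).
ages = [34, 71, 58, 66, 45, 80, 29, 63]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Smaller epsilon means more noise and a stronger privacy guarantee.
noisy_over_60 = dp_count(ages, lambda a: a > 60, epsilon=0.5, rng=rng)
print(f"noisy count of patients over 60: {noisy_over_60:.2f}")
```

This is a teaching sketch, not a production mechanism: floating-point Laplace sampling has known weaknesses, and a real deployment would also need to track the privacy budget spent across repeated queries.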

Citing Articles

Cities, communities and clinics can be testbeds for human exposome and aging research.

Woods T, Palmarini N, Corner L, Barzilai N, Maier A, Sagner M Nat Med. 2025; .

PMID: 40075229 DOI: 10.1038/s41591-025-03519-8.


Deep learning-based image analysis in muscle histopathology using photo-realistic synthetic data.

Mill L, Aust O, Ackermann J, Burger P, Pascual M, Palumbo-Zerr K Commun Med (Lond). 2025; 5(1):64.

PMID: 40050400 PMC: 11885816. DOI: 10.1038/s43856-025-00777-y.


Externally validated and clinically useful machine learning algorithms to support patient-related decision-making in oncology: a scoping review.

Santos C, Amorim-Lopes M BMC Med Res Methodol. 2025; 25(1):45.

PMID: 39984835 PMC: 11843972. DOI: 10.1186/s12874-025-02463-y.


How good is your synthetic data? SynthRO, a dashboard to evaluate and benchmark synthetic tabular data.

Santangelo G, Nicora G, Bellazzi R, Dagliati A BMC Med Inform Decis Mak. 2025; 25(1):89.

PMID: 39966793 PMC: 11837667. DOI: 10.1186/s12911-024-02731-9.


FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare.

Lekadir K, Frangi A, Porras A, Glocker B, Cintas C, Langlotz C BMJ. 2025; 388:e081554.

PMID: 39909534 PMC: 11795397. DOI: 10.1136/bmj-2024-081554.

