PMID: 38783412

OpenSAFELY: A Platform for Analysing Electronic Health Records Designed for Reproducible Research

Abstract

Electronic health records (EHRs) and other administrative health data are increasingly used in research to generate evidence on the effectiveness, safety, and utilisation of medical products and services, and to inform public health guidance and policy. Reproducibility is fundamental to research credibility and promotes trust in evidence generated from EHRs. At present, however, making EHR research reproducible can be challenging for researchers. Research software platforms can provide technical solutions that enhance the reproducibility of such research. In response to the COVID-19 pandemic, we developed OpenSAFELY, a secure, transparent, open-source analytics platform designed with reproducible research in mind. OpenSAFELY mitigates common barriers to reproducible research by: standardising key workflows around data preparation; removing barriers to code-sharing in secure analysis environments; enforcing public sharing of programming code and codelists; ensuring the same computational environment is used everywhere; integrating new and existing tools that encourage and enable the use of reproducible working practices; and providing an audit trail for all code that is run against the real data to increase transparency. This paper describes OpenSAFELY's reproducibility-by-design approach in detail.

Citing Articles

The Applications of Machine Learning in the Management of Patients Undergoing Stem Cell Transplantation: Are We Ready?

Garuffo L, Leoni A, Gatta R, Bernardi S. Cancers (Basel). 2025; 17(3).

PMID: 39941764 PMC: 11816169. DOI: 10.3390/cancers17030395.


Changes in opioid prescribing during the COVID-19 pandemic in England: an interrupted time-series analysis in the OpenSAFELY-TTP cohort.

Schaffer A, Andrews C, Brown A, Croker R, Hulme W, Nab L. Lancet Public Health. 2024; 9(7):e432-e442.

PMID: 38942555 PMC: 7616651. DOI: 10.1016/S2468-2667(24)00100-2.
