PMID: 26104741

Design and Implementation of a Privacy Preserving Electronic Health Record Linkage Tool in Chicago

Abstract

Objective: To design and implement a tool that creates a secure, privacy preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research.

Methods: The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a Health Insurance Portability and Accountability Act (HIPAA)-compliant SHA-512 algorithm that minimizes re-identification risk. The authors then linked individual records through a central honest broker, using an algorithm that assigns weights to hash combinations to generate high-specificity matches.
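
The cleaning and seeded-hashing step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' software: the seed value, the field names (`first_name`, `last_name`, `dob`, `ssn`), the normalization rules, and the particular identifier combinations are all assumptions made for the example.

```python
import hashlib
import unicodedata

# Hypothetical shared secret; every site would need to use the same seed.
SEED = "example-shared-seed"

# Hypothetical identifier combinations; the abstract does not list the real ones.
COMBINATIONS = {
    "name_dob": ["first_name", "last_name", "dob"],
    "name_ssn": ["first_name", "last_name", "ssn"],
    "dob_ssn": ["dob", "ssn"],
}


def normalize(value: str) -> str:
    """Strip accents, uppercase, and drop every non-alphanumeric character."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.upper() if ch.isalnum())


def seeded_hash(values: list[str], seed: str = SEED) -> str:
    """Return the SHA-512 hex digest of the seed joined with the cleaned values."""
    payload = "|".join([seed] + [normalize(v) for v in values])
    return hashlib.sha512(payload.encode("utf-8")).hexdigest()


def hash_combinations(patient: dict[str, str]) -> dict[str, str]:
    """Hash each identifier combination whose fields are all present."""
    hashes = {}
    for label, keys in COMBINATIONS.items():
        values = [normalize(patient.get(key, "")) for key in keys]
        if all(values):  # skip a combination if any identifier is missing
            hashes[label] = seeded_hash(values)
    return hashes
```

Because cleaning happens before hashing, records such as `José O'Brien, 1980-01-02` and `jose OBRIEN, 19800102` yield identical digests, and only the digests would need to leave the site.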

Results: The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11 292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%; most of the missed matches involved patients with both a missing Social Security number and a last name change. Using 3 disease examples, the authors demonstrate that the software can reduce duplication of patient records across sites by as much as 28%.
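
For readers less familiar with the metrics reported above, the following sketch shows how sensitivity, specificity, and a duplicate-reduction fraction are conventionally computed. The function names are illustrative, and the abstract does not report the underlying confusion-matrix counts, so none are reproduced here.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true matches that the linkage found: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of true non-matches correctly left unlinked: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def duplicate_reduction(records_before: int, records_after: int) -> float:
    """Fraction of records removed by de-duplication."""
    return (records_before - records_after) / records_before
```

Applied to the rounded totals in the Results (7 million records, 5 million unique), `duplicate_reduction(7_000_000, 5_000_000)` gives about 0.29. This is an overall figure derived from rounded numbers and is distinct from the per-disease reduction of up to 28%.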

Conclusions: Software that standardizes the assignment of a unique seeded hash identifier, merged through an agreed-upon third-party honest broker, can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data, because patients may use multiple healthcare systems.
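
One way an honest broker might merge hashed records using weighted hash combinations is sketched below. The abstract does not specify the weighting scheme, so the combination labels, weights, threshold, and the additive scoring rule are all hypothetical; the pairwise comparison is also only practical for small examples.

```python
from itertools import combinations

# Hypothetical weights: combinations built on stronger identifiers count more.
WEIGHTS = {"name_dob": 0.5, "name_ssn": 0.9, "dob_ssn": 0.8}
THRESHOLD = 0.8  # hypothetical cut-off, set high to favour specificity


def match_score(a: dict[str, str], b: dict[str, str]) -> float:
    """Sum the weights of the hash combinations on which two records agree."""
    return sum(w for key, w in WEIGHTS.items() if key in a and a[key] == b.get(key))


def link(records: dict[str, dict[str, str]]) -> dict[str, str]:
    """Map each record ID to a cluster ID using union-find over matching pairs."""
    parent = {rid: rid for rid in records}

    def find(rid: str) -> str:
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]  # path halving
            rid = parent[rid]
        return rid

    for r1, r2 in combinations(records, 2):
        if match_score(records[r1], records[r2]) >= THRESHOLD:
            parent[find(r1)] = find(r2)

    return {rid: find(rid) for rid in records}
```

In this sketch the broker sees only hash digests keyed by site-local record IDs, and records that share a cluster ID are treated as the same patient. A weak agreement alone (here, name plus date of birth) falls below the threshold and does not trigger a link.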

Citing Articles

Data linkage multiplies research insights across diverse healthcare sectors.

Eisinger-Mathason T, Leshin J, Lahoti V, Fridsma D, Mucaj V, Kho A Commun Med (Lond). 2025; 5(1):58.

PMID: 40038513 PMC: 11880312. DOI: 10.1038/s43856-025-00769-y.


Accuracy of privacy preserving record linkage for real world data in the United States: a systematic review.

Tyagi K, Willis S JAMIA Open. 2025; 8(1):ooaf002.

PMID: 39845287 PMC: 11752849. DOI: 10.1093/jamiaopen/ooaf002.


Privacy preserving record linkage for public health action: opportunities and challenges.

Pathak A, Serrer L, Zapata D, King R, Mirel L, Sukalac T J Am Med Inform Assoc. 2024; 31(11):2605-2612.

PMID: 39047294 PMC: 11491627. DOI: 10.1093/jamia/ocae196.


Linking Patient Encounters across Primary and Ancillary Electronic Health Record Systems: A Comparison of Two Approaches.

Davila 3rd M, Sholle E, Fuld X, Israel M, Cole C, Campion Jr T ACI open. 2024; 8(1):e43-e48.

PMID: 38765555 PMC: 11101195. DOI: 10.1055/s-0044-1782679.


Case study on communicating with research ethics committees about minimizing risk through software: an application for record linkage in secondary data analysis.

Schmit C, Ferdinand A, Giannouchos T, Kum H JAMIA Open. 2024; 7(1):ooae010.

PMID: 38425705 PMC: 10903982. DOI: 10.1093/jamiaopen/ooae010.

