Enhancing Patient Representation Learning with Inferred Family Pedigrees Improves Disease Risk Prediction

Overview
Date 2024 Dec 26
PMID 39723811
Abstract

Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.

Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.
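The idea of attending over relatives can be illustrated with a minimal sketch of one single-head graph-attention step in the style of a standard graph attention network. This is not ALIGATEHR's implementation: it omits the medical ontology component, and the function name `family_attention`, the toy pedigree, the projection matrix `W`, and the attention vector `a` are all hypothetical values chosen for illustration.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def family_attention(features, relatives, W, a):
    """One single-head graph-attention step over a family graph.

    features:  {patient_id: feature vector}
    relatives: {patient_id: list of distinct relative ids} (hypothetical pedigree)
    W:         projection matrix, one row per output dimension
    a:         attention vector of length 2 * output dimensions
    Returns (updated embeddings, attention weights per patient).
    """
    projected = {p: matvec(W, x) for p, x in features.items()}
    embeddings, weights = {}, {}
    for p, hp in projected.items():
        # Each patient attends to itself and to its known relatives.
        neighbours = [p] + [r for r in relatives.get(p, []) if r in projected]
        # Score each neighbour from the concatenated pair of projections.
        scores = [
            leaky_relu(sum(ai * v for ai, v in zip(a, hp + projected[n])))
            for n in neighbours
        ]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        weights[p] = dict(zip(neighbours, alpha))
        # New embedding: attention-weighted sum of neighbour projections.
        embeddings[p] = [
            sum(al * projected[n][d] for al, n in zip(alpha, neighbours))
            for d in range(len(hp))
        ]
    return embeddings, weights


# Toy example: three patients with made-up two-dimensional clinical features.
features = {"child": [1.0, 0.0], "mother": [0.0, 1.0], "father": [1.0, 1.0]}
relatives = {"child": ["mother", "father"], "mother": ["child"], "father": ["child"]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity projection, for readability
a = [0.5, 0.5, 0.5, 0.5]       # hypothetical attention parameters
embeddings, weights = family_attention(features, relatives, W, a)
```

In this sketch, `weights["child"]` holds one coefficient per family member that sums to 1, which is the kind of quantity one could inspect to see which relatives' profiles contribute most to a patient's representation. In a trained model, `W` and `a` would be learned rather than fixed by hand.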

Results: Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to distinguish the two main inflammatory bowel disease subtypes, Crohn's disease and ulcerative colitis, which share many risk factors and symptoms.

Conclusion: Overall, our results show that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.

Citing Articles

Beyond the individual.

Bakken S J Am Med Inform Assoc. 2025; 32(3):415-416.

PMID: 39963970 PMC: 11833475. DOI: 10.1093/jamia/ocaf020.
