
TARGETING UNDERREPRESENTED POPULATIONS IN PRECISION MEDICINE: A FEDERATED TRANSFER LEARNING APPROACH

Overview
Journal Ann Appl Stat
Date 2024 Sep 24
PMID 39314265
Abstract

The limited representation of minorities and disadvantaged populations in large-scale clinical and genomics research poses a significant barrier to translating precision medicine research into practice. Prediction models are likely to underperform in underrepresented populations due to heterogeneity across populations, thereby exacerbating known health disparities. To address this issue, we propose FETA, a two-way data integration method that leverages a federated transfer learning approach to integrate heterogeneous data from diverse populations and multiple healthcare institutions, with a focus on a target population of interest having limited sample sizes. We show that FETA achieves performance comparable to the pooled analysis, where individual-level data is shared across institutions, with only a small number of communications across participating sites. Our theoretical analysis and simulation study demonstrate how FETA's estimation accuracy is influenced by communication budgets, privacy restrictions, and heterogeneity across populations. We apply FETA to multisite data from the electronic Medical Records and Genomics (eMERGE) Network to construct genetic risk prediction models for extreme obesity. Compared to models trained using target data only, source data only, and all data without accounting for population-level differences, FETA shows superior predictive performance. FETA has the potential to improve estimation and prediction accuracy in underrepresented populations and reduce the gap in model performance across populations.
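As a rough illustration of the two ingredients named in the abstract (summary-level communication across sites, followed by adaptation to a small target sample), the sketch below fits a shared model by averaging gradients from simulated source sites and then estimates a sparse target-specific shift. It is not the FETA algorithm from the paper: the squared-error loss, the gradient-averaging scheme, the lasso correction, the simulated data, and all function names and tuning values are assumptions made for illustration only.

```python
import numpy as np


def local_gradient(X, y, beta):
    """Least-squares gradient computed inside one site; only this vector is shared."""
    return X.T @ (X @ beta - y) / len(y)


def federated_fit(sites, dim, rounds=30, step=0.5):
    """Gradient descent in which each round costs one communication per site."""
    beta = np.zeros(dim)
    total = sum(len(y) for _, y in sites)
    for _ in range(rounds):
        # The coordinator sees sample-size-weighted gradients, never raw rows.
        grad = sum(len(y) * local_gradient(X, y, beta) for X, y in sites) / total
        beta = beta - step * grad
    return beta


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def target_correction(X, y, beta_shared, lam=0.1, iters=500):
    """Lasso fit (ISTA) of a sparse shift on the target site's residuals."""
    n = len(y)
    resid = y - X @ beta_shared
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    delta = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ delta - resid) / n
        delta = soft_threshold(delta - step * grad, step * lam)
    return delta


# Simulated data (hypothetical): three large source sites, one small target site.
rng = np.random.default_rng(0)
dim = 10
beta_source = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
delta_true = np.zeros(dim)
delta_true[0] = 2.0  # the target population differs in a single coefficient
beta_target_true = beta_source + delta_true

sites = []
for _ in range(3):
    X = rng.normal(size=(500, dim))
    sites.append((X, X @ beta_source + 0.5 * rng.normal(size=500)))

X_t = rng.normal(size=(40, dim))
y_t = X_t @ beta_target_true + 0.5 * rng.normal(size=40)

beta_shared = federated_fit(sites, dim)               # uses source summaries only
delta_hat = target_correction(X_t, y_t, beta_shared)  # uses target data only
beta_target_hat = beta_shared + delta_hat

err_source_only = np.linalg.norm(beta_shared - beta_target_true)
err_combined = np.linalg.norm(beta_target_hat - beta_target_true)
print(f"source-only error: {err_source_only:.3f}")
print(f"shared + target correction error: {err_combined:.3f}")
```

In this toy setup the `rounds` argument plays the role of a communication budget, and `lam` controls how much the target estimate is allowed to deviate from the shared model; both values here are arbitrary.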

Citing Articles

Polygenic prediction for underrepresented populations through transfer learning by utilizing genetic similarity shared with European populations.

Zhu Y, Chen W, Zhu K, Liu Y, Huang S, Zeng P Brief Bioinform. 2025; 26(1).

PMID: 39905953 PMC: 11794457. DOI: 10.1093/bib/bbaf048.


A robust transfer learning approach for high-dimensional linear regression to support integration of multi-source gene expression data.

Pan L, Gao Q, Wei K, Yu Y, Qin G, Wang T PLoS Comput Biol. 2025; 21(1):e1012739.

PMID: 39792955 PMC: 11756795. DOI: 10.1371/journal.pcbi.1012739.


Multi-Task Learning with Summary Statistics.

Knight P, Duan R Adv Neural Inf Process Syst. 2024; 36:54020-54031.

PMID: 39351341 PMC: 11440483.


TARGETING UNDERREPRESENTED POPULATIONS IN PRECISION MEDICINE: A FEDERATED TRANSFER LEARNING APPROACH.

Li B, Cai T, Duan R Ann Appl Stat. 2024; 17(4):2970-2992.

PMID: 39314265 PMC: 11417462. DOI: 10.1214/23-AOAS1747.


Multi-Source Conformal Inference Under Distribution Shift.

Liu Y, Levis A, Normand S, Han L Proc Mach Learn Res. 2024; 235:31344-31382.

PMID: 39193374 PMC: 11345809.

