
Machine Learning Prediction of Biomarkers from SNPs and of Disease Risk from Biomarkers in the UK Biobank

Overview
Journal: Genes (Basel)
Publisher: MDPI
Date: 2021 Jul 2
PMID: 34209487
Citations: 6
Abstract

We use UK Biobank data to train predictors of 65 blood and urine markers, including HDL, LDL, lipoprotein A, and glycated haemoglobin, from SNP genotype. For example, our polygenic score (PGS) predictor correlates ∼0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait yet produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information); we call these predictors biomarker risk scores (BMRS). Individuals at high risk (e.g., odds ratio of >5× the population average) can be identified for conditions such as coronary artery disease (AUC ∼0.75), diabetes (AUC ∼0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ∼10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (PRS: risk conditional on genotype) for common diseases with the risk predictors obtained by composing the learned functions BMRS and PGS, i.e., applying the BMRS predictors to the PGS output.
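The final comparison in the abstract applies a biomarker-based risk predictor to genetically predicted biomarker values. The sketch below illustrates that idea only. It assumes a linear PGS and a logistic BMRS, neither of which is specified in the abstract, and all function names, array shapes, and weights are hypothetical and randomly generated rather than taken from the study.

```python
import numpy as np

def pgs_predict(genotypes, weights, intercept):
    """Assumed linear PGS: (n_people, n_snps) genotypes -> (n_people, n_biomarkers)."""
    return genotypes @ weights + intercept

def bmrs_predict(biomarkers, coefs, bias):
    """Assumed logistic BMRS: (n_people, n_biomarkers) -> risk probability per person."""
    z = biomarkers @ coefs + bias
    return 1.0 / (1.0 + np.exp(-z))

def composed_risk(genotypes, weights, intercept, coefs, bias):
    """Apply the BMRS to the PGS output, i.e. BMRS(PGS(genotype))."""
    predicted_biomarkers = pgs_predict(genotypes, weights, intercept)
    return bmrs_predict(predicted_biomarkers, coefs, bias)

# Synthetic toy data; none of these numbers come from the UK Biobank.
rng = np.random.default_rng(0)
n_people, n_snps, n_biomarkers = 5, 20, 3
genotypes = rng.integers(0, 3, size=(n_people, n_snps)).astype(float)  # allele counts 0/1/2
weights = rng.normal(scale=0.1, size=(n_snps, n_biomarkers))  # hypothetical SNP effects
intercept = np.zeros(n_biomarkers)
coefs = rng.normal(size=n_biomarkers)  # hypothetical biomarker effects
bias = -1.0

risk = composed_risk(genotypes, weights, intercept, coefs, bias)
print(risk)
```

In this toy setup the composed predictor takes only genotype as input, so its output can be compared person by person with a PRS trained directly on disease status.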

Citing Articles

Cardiovascular disease prediction model based on patient behavior patterns in the context of deep learning: a time-series data analysis perspective.

Wang Y, Rao C, Cheng Q, Yang J Front Psychiatry. 2024; 15:1418969.

PMID: 39676910 PMC: 11640863. DOI: 10.3389/fpsyt.2024.1418969.


Advancing healthcare: the role and impact of AI and foundation models.

Mahesh N, Devishamani C, Raghu K, Mahalingam M, Bysani P, Chakravarthy A Am J Transl Res. 2024; 16(6):2166-2179.

PMID: 39006256 PMC: 11236664. DOI: 10.62347/WQWV9220.


Revolutionizing healthcare: the role of artificial intelligence in clinical practice.

Alowais S, Alghamdi S, Alsuhebany N, Alqahtani T, Alshaya A, Almohareb S BMC Med Educ. 2023; 23(1):689.

PMID: 37740191 PMC: 10517477. DOI: 10.1186/s12909-023-04698-z.


Biobank-scale methods and projections for sparse polygenic prediction from machine learning.

Raben T, Lello L, Widen E, Hsu S Sci Rep. 2023; 13(1):11662.

PMID: 37468507 PMC: 10356957. DOI: 10.1038/s41598-023-37580-5.


Survey and Evaluation of Hypertension Machine Learning Research.

Du Toit C, Tran T, Deo N, Aryal S, Lip S, Sykes R J Am Heart Assoc. 2023; 12(9):e027896.

PMID: 37119074 PMC: 10227215. DOI: 10.1161/JAHA.122.027896.

