
An Interactive Atlas of Genomic, Proteomic, and Metabolomic Biomarkers Promotes the Potential of Proteins to Predict Complex Diseases

Overview
Journal Sci Rep
Specialty Science
Date 2024 Jun 3
PMID 38830935
Abstract

Multiomics analyses have identified multiple potential biomarkers of the incidence and prevalence of complex diseases. However, it is not known which type of biomarker is optimal for clinical purposes. Here, we systematically compare 90 million genetic variants, 1453 proteins, and 325 metabolites from 500,000 individuals with complex diseases from the UK Biobank. A machine learning pipeline consisting of data cleaning, data imputation, feature selection, and model training with cross-validation, with results compared on holdout test sets, showed that proteins were the most predictive, followed by metabolites and then genetic variants. Using only five proteins per disease resulted in median (min-max) areas under the receiver operating characteristic curve of 0.79 (0.65-0.86) for incidence and 0.84 (0.70-0.91) for prevalence. In summary, our work suggests the potential of predicting complex diseases based on a limited number of proteins. We provide an interactive atlas (macd.shinyapps.io/ShinyApp/) to find genomic, proteomic, or metabolomic biomarkers for different complex diseases.
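The reported metric, a median (min-max) area under the receiver operating characteristic curve across diseases, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `summarize`, the disease labels, and all scores below are hypothetical toy values chosen only to show how such a summary could be computed from holdout-set predictions.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs in which the case
    receives the higher score; tied pairs count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def summarize(aucs):
    """Format a list of AUCs as 'median (min-max)'."""
    return f"{median(aucs):.2f} ({min(aucs):.2f}-{max(aucs):.2f})"


# Hypothetical holdout labels (1 = case) and model scores per disease.
toy_holdout = {
    "disease_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "disease_b": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
    "disease_c": ([1, 0, 0, 1], [0.6, 0.6, 0.9, 0.9]),
}

aucs = {name: roc_auc(y, s) for name, (y, s) in toy_holdout.items()}
summary = summarize(list(aucs.values()))
print(summary)
```

The pairwise definition used here is quadratic in the number of samples, which is fine for illustration; at the scale of a biobank cohort a rank-based computation would be the more practical choice.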

Citing Articles

Plasma Proteomic Signatures of Physical Activity Provide Insights into Biological Impacts of Physical Activity and its Protective Role Against Dementia.

Arani G, Arora A, Yang S, Wu J, Kraszewski J, Martins A. medRxiv. 2025.

PMID: 39867359 PMC: 11759254. DOI: 10.1101/2025.01.16.25320290.


Plasma protein-based and polygenic risk scores serve complementary roles in predicting inflammatory bowel disease.

Woerner J, Westbrook T, Jeong S, Shivakumar M, Greenplate A, Apostolidis S. Pac Symp Biocomput. 2024; 30:522-534.

PMID: 39670393 PMC: 11649021.
