PMID: 19209722

A Bayesian Integration Model of High-throughput Proteomics and Metabolomics Data for Improved Early Detection of Microbial Infections

Abstract

High-throughput (HTP) technologies can evaluate the genome, proteome, and metabolome of an organism at a global scale. This opens up new opportunities to define complex signatures of disease that involve signals from multiple types of biomolecules. However, integrating these data types is difficult because the data are heterogeneous. We present a Bayesian approach to integration that uses posterior probabilities to assign class memberships to samples from individual or multiple data sources; these probabilities are based on lower-level likelihood functions derived from standard statistical learning algorithms. We demonstrate this approach on microbial infections of mice, where the bronchial alveolar lavage fluid was analyzed by three HTP technologies, two proteomic and one metabolomic. Integrating the three datasets improves classification accuracy from approximately 83% for the best individual dataset to approximately 89%. In addition, we present a new visualization tool called Visual Integration for Bayesian Evaluation (VIBE). It lets the user observe classification accuracies at the class level and evaluate classification accuracies on any subset of the available data types, based on the posterior probability models defined for the individual and integrated data.
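The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the data sources are conditionally independent given the class, so the per-source likelihoods multiply, and the function name `integrate_posteriors`, the class labels, and all numeric values are hypothetical.

```python
import math

def integrate_posteriors(likelihoods_by_source, prior):
    """Combine per-source class likelihoods into a single posterior.

    likelihoods_by_source: list of dicts, each mapping a class label to
        P(data from that source | class), e.g. from a fitted classifier.
    prior: dict mapping each class label to P(class).

    Assumption (not stated in the abstract): sources are conditionally
    independent given the class.
    """
    floor = 1e-300  # avoids log(0) when a likelihood is exactly zero
    log_scores = {}
    for label, p_class in prior.items():
        score = math.log(max(p_class, floor))
        for likelihood in likelihoods_by_source:
            score += math.log(max(likelihood[label], floor))
        log_scores[label] = score

    # Normalise in log space (log-sum-exp) for numerical stability.
    peak = max(log_scores.values())
    total = sum(math.exp(s - peak) for s in log_scores.values())
    return {label: math.exp(s - peak) / total
            for label, s in log_scores.items()}

# Hypothetical likelihoods for one sample from three data sources.
prior = {"infected": 0.5, "control": 0.5}
proteomics_a = {"infected": 0.60, "control": 0.40}
proteomics_b = {"infected": 0.70, "control": 0.30}
metabolomics = {"infected": 0.55, "control": 0.45}

all_sources = integrate_posteriors(
    [proteomics_a, proteomics_b, metabolomics], prior)
one_source = integrate_posteriors([proteomics_a], prior)
```

Because the function accepts any list of sources, the same call evaluates a single dataset or any subset of datasets, which mirrors the subset comparison the abstract attributes to VIBE.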

Citing Articles

ProMetIS, deep phenotyping of mouse models by combined proteomics and metabolomics analysis.

Imbert A, Rompais M, Selloum M, Castelli F, Mouton-Barbosa E, Brandolini-Bunlon M. Sci Data. 2021; 8(1):311.

PMID: 34862403 PMC: 8642540. DOI: 10.1038/s41597-021-01095-3.


Prediction of the development of islet autoantibodies through integration of environmental, genetic, and metabolic markers.

Webb-Robertson B, Bramer L, Stanfill B, Reehl S, Nakayasu E, Metz T. J Diabetes. 2020; 13(2):143-153.

PMID: 33124145 PMC: 7818425. DOI: 10.1111/1753-0407.13093.


Predictive Modeling of Type 1 Diabetes Stages Using Disparate Data Sources.

Frohnert B, Webb-Robertson B, Bramer L, Reehl S, Waugh K, Steck A. Diabetes. 2019; 69(2):238-248.

PMID: 31740441 PMC: 6971485. DOI: 10.2337/db18-1263.


Metabolomics in childhood diabetes.

Frohnert B, Rewers M. Pediatr Diabetes. 2015; 17(1):3-14.

PMID: 26420304 PMC: 4703499. DOI: 10.1111/pedi.12323.


Mechanism-Based Classification of PAH Mixtures to Predict Carcinogenic Potential.

Tilton S, Siddens L, Krueger S, Larkin A, Lohr C, Williams D. Toxicol Sci. 2015; 146(1):135-45.

PMID: 25908611 PMC: 4476464. DOI: 10.1093/toxsci/kfv080.

