NMFGOT: a Multi-view Learning Framework for the Microbiome and Metabolome Integrative Analysis with Optimal Transport Plan

Overview
Date 2024 Nov 24
PMID 39582023
Abstract

The rapid development of high-throughput sequencing techniques provides an unprecedented opportunity to generate biological insights into microbiome-related diseases. However, the relationships among microbes, metabolites, and the human microenvironment are extremely complex, making data analysis challenging. Here, we present NMFGOT, a versatile toolkit for the integrative analysis of microbiome and metabolome data from the same samples. NMFGOT is an unsupervised learning framework based on nonnegative matrix factorization with graph-regularized optimal transport: it uses the optimal transport plan to measure the probability distance between microbiome samples, which better handles the nonlinear high-order interactions among microbial taxa and metabolites. It also includes a spatial regularization term to preserve the spatial consistency of samples in the embedding space across data modalities. We applied NMFGOT to several multi-omics microbiome datasets from multiple cohorts. The experimental results showed that NMFGOT consistently performed well compared with several recently published multi-omics integration methods. NMFGOT also facilitates downstream biological analysis, including pathway enrichment analysis and disease-specific metabolite-microbe association analysis. Using NMFGOT, we identified significant and stable metabolite-microbe associations in GC and ESRD, which improves our understanding of the mechanisms of complex human diseases.
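
To make the idea of combining optimal transport distances with graph-regularized nonnegative matrix factorization more concrete, the following is a minimal single-view sketch and not the NMFGOT implementation. It rests on several assumptions that do not come from the abstract: the ground cost between taxa is derived from their Pearson correlation, the sample graph uses a Gaussian kernel on the OT distances, the factorization uses the standard multiplicative updates for graph-regularized NMF, and the data are synthetic. All function and variable names are hypothetical.

```python
import numpy as np


def sinkhorn_plan(a, b, C, reg=0.1, n_iter=200):
    """Entropic-regularized OT plan between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-C / reg)              # Gibbs kernel of the ground cost
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)             # rescale to match column marginals
        u = a / (K @ v)               # rescale to match row marginals
    return u[:, None] * K * v[None, :]


def ot_distance_matrix(X, C, reg=0.1):
    """Pairwise OT cost between samples (rows of X), each treated as a distribution."""
    P = X + 1e-9                      # tiny pseudo-count avoids all-zero entries
    P = P / P.sum(axis=1, keepdims=True)
    n = P.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            plan = sinkhorn_plan(P[i], P[j], C, reg)
            D[i, j] = D[j, i] = np.sum(plan * C)
    return D


def gaussian_affinity(D):
    """Assumed graph construction: Gaussian kernel on distances, no self-loops."""
    sigma = np.median(D[D > 0])
    W = np.exp(-(D ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def graph_regularized_nmf(X, W, k=3, lam=1.0, n_iter=200, seed=0):
    """Minimize ||X - U V^T||_F^2 + lam * tr(V^T L V) with multiplicative updates.

    X: features x samples (nonnegative); W: samples x samples affinity.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    Dg = np.diag(W.sum(axis=1))       # degree matrix
    L = Dg - W                        # graph Laplacian
    eps = 1e-10                       # guards against division by zero
    history = []
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (Dg @ V) + eps)
        obj = np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
        history.append(obj)
    return U, V, history


# Synthetic stand-in for a microbiome abundance table: 12 samples x 8 taxa.
rng = np.random.default_rng(42)
X_microbe = rng.random((12, 8))

# Assumed ground cost between taxa: (1 - Pearson correlation) / 2, clipped to [0, 1].
ground_cost = np.clip((1.0 - np.corrcoef(X_microbe, rowvar=False)) / 2.0, 0.0, 1.0)

D_ot = ot_distance_matrix(X_microbe, ground_cost)
W_graph = gaussian_affinity(D_ot)
U, V, history = graph_regularized_nmf(X_microbe.T, W_graph, k=3)

print("sample embedding shape:", V.shape)
```

Here the rows of `V` play the role of low-dimensional sample embeddings, and the Laplacian term pulls samples that are close in OT distance toward nearby embeddings. A multi-view version would factorize the microbiome and metabolome matrices jointly, which this sketch does not attempt.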
