Comparison of the Effectiveness of Different Normalization Methods for Metagenomic Cross-study Phenotype Prediction Under Heterogeneity

Overview
Journal Sci Rep
Specialty Science
Date 2024 Mar 26
PMID 38528097
Abstract

The human microbiome, comprising microorganisms residing within and on the human body, plays a crucial role in various physiological processes and has been linked to numerous diseases. To analyze microbiome data, it is essential to account for inherent heterogeneity and variability across samples. Normalization methods have been proposed to mitigate these variations and enhance comparability. However, the performance of these methods in predicting binary phenotypes remains understudied. This study systematically evaluates different normalization methods in microbiome data analysis and their impact on disease prediction. Our findings highlight the strengths and limitations of scaling, compositional data analysis, transformation, and batch correction methods. Scaling methods like TMM show consistent performance, while compositional data analysis methods exhibit mixed results. Transformation methods, such as Blom and NPN, demonstrate promise in capturing complex associations. Batch correction methods, including BMC and Limma, consistently outperform other approaches. However, the influence of normalization methods is constrained by population effects, disease effects, and batch effects. These results provide insights for selecting appropriate normalization approaches in microbiome research, improving predictive models, and advancing personalized medicine. Future research should explore larger and more diverse datasets and develop tailored normalization strategies for microbiome data analysis.
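Two of the method families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the Blom transformation is the usual rank-based inverse normal transform with offset 3/8, and that BMC stands for batch mean centering, each applied to one taxon's abundances across samples. The function names `blom_transform` and `batch_mean_center` and the toy values are hypothetical.

```python
from statistics import NormalDist, mean


def blom_transform(values):
    """Rank-based inverse normal transform with Blom's offset c = 3/8.

    Assumption: applied to a single feature (one taxon) across samples.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the average 1-based rank.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to a standard normal quantile: (r - 3/8) / (n + 1/4).
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


def batch_mean_center(values, batches):
    """Subtract each batch's mean from the values that belong to that batch."""
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]


# Hypothetical toy data: one taxon measured in two studies ("a" and "b").
abundance = [1.0, 3.0, 10.0, 14.0]
study = ["a", "a", "b", "b"]
centered = batch_mean_center(abundance, study)  # [-1.0, 1.0, -2.0, 2.0]
scores = blom_transform(abundance)              # symmetric around zero
```

After centering, both studies have mean zero for this taxon, so a classifier trained on one study no longer sees a study-level shift. The Blom scores depend only on ranks, which makes them insensitive to differences in scale between studies.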

Citing Articles

Disrupted microbial cross-feeding and altered L-phenylalanine consumption in people living with HIV.
Nguyen H, Kim W. Brief Bioinform. 2025; 26(2).
PMID: 40072847. PMC: 11899578. DOI: 10.1093/bib/bbaf111.

Domain adaptation in small-scale and heterogeneous biological datasets.
Orouji S, Liu M, Korem T, Peters M. Sci Adv. 2024; 10(51):eadp6040.
PMID: 39705361. PMC: 11661433. DOI: 10.1126/sciadv.adp6040.

Review and revamp of compositional data transformation: A new framework combining proportion conversion and contrast transformation.
Zhang Y, Schluter J, Zhang L, Cao X, Jenq R, Feng H. Comput Struct Biotechnol J. 2024; 23:4088-4107.
PMID: 39624165. PMC: 11609487. DOI: 10.1016/j.csbj.2024.11.003.

Evaluation of normalization methods for predicting quantitative phenotypes in metagenomic data analysis.
Wang B, Luan Y. Front Genet. 2024; 15:1369628.
PMID: 38903761. PMC: 11188486. DOI: 10.3389/fgene.2024.1369628.
