Methods for Phylogenetic Analysis of Microbiome Data
How does knowing the evolutionary history of microorganisms affect our analysis of microbiological datasets? Depending on the research question, the common ancestry of microorganisms can be a source of confounding variation or a scaffolding used for inference. For example, when performing regression on traits, common ancestry is a source of dependence among observations, whereas when searching for clades with correlated abundances, common ancestry is the scaffolding for inference. The common ancestry of microorganisms and their genes is organized in trees (phylogenies), which can and should be incorporated into analyses of microbial datasets. While there has been a recent expansion of phylogenetically informed analytical tools, little guidance exists on which method best answers which biological question. Here, we review methods for phylogeny-aware analyses of microbiome datasets, considerations for choosing the appropriate method, and challenges inherent in these methods. We introduce a conceptual organization of these tools, breaking them down into phylogenetic comparative methods, ancestral state reconstruction, and analysis of phylogenetic variables and distances, and provide examples in Supplementary Online Tutorials. Careful consideration of the research question and of the ecological and evolutionary assumptions will help researchers choose a phylogeny and appropriate methods to produce accurate, biologically informative and previously unreported insights.
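To make the point about dependence among observations concrete, the following is a minimal sketch, not taken from the article or its Supplementary Online Tutorials, of how a phylogeny can enter a trait regression. It assumes a Brownian-motion model of trait evolution, under which the covariance between two tips equals the branch length they share from the root, and fits the regression by generalized least squares (often called PGLS). The four-tip tree, the trait values, and the helper names `bm_covariance` and `gls_fit` are all hypothetical, chosen only for illustration.

```python
import numpy as np


def bm_covariance(paths):
    """Tip covariance under Brownian motion (assumed model).

    `paths` maps each tip to its root-to-tip path, given as a list of
    (branch_id, branch_length) pairs. The covariance of two tips is the
    total length of the branches their paths have in common.
    """
    tips = list(paths)
    C = np.zeros((len(tips), len(tips)))
    for i, a in enumerate(tips):
        for j, b in enumerate(tips):
            shared = set(paths[a]) & set(paths[b])
            C[i, j] = sum(length for _, length in shared)
    return tips, C


def gls_fit(X, y, C):
    """Generalized least squares: beta = (X' C^-1 X)^-1 X' C^-1 y."""
    Ci_X = np.linalg.solve(C, X)
    Ci_y = np.linalg.solve(C, y)
    return np.linalg.solve(X.T @ Ci_X, X.T @ Ci_y)


# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1); "n1" and "n2" are internal branches.
paths = {
    "A": [("n1", 1.0), ("A", 1.0)],
    "B": [("n1", 1.0), ("B", 1.0)],
    "C": [("n2", 1.0), ("C", 1.0)],
    "D": [("n2", 1.0), ("D", 1.0)],
}
tips, C = bm_covariance(paths)

# Made-up, noiseless trait data so the true coefficients are known.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 0.5 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns

beta_gls = gls_fit(X, y, C)           # accounts for shared ancestry
beta_ols = gls_fit(X, y, np.eye(4))   # treats tips as independent
print(tips, beta_gls, beta_ols)
```

With noiseless data the two fits agree; with real traits they can differ, because the phylogenetic fit downweights information duplicated by closely related tips. In practice an established phylogenetic comparative package would usually be preferable to a hand-rolled estimator like this one.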