Genome-scale Coestimation of Species and Gene Trees
Comparisons of gene trees and species trees are key to understanding major processes of genome evolution such as gene duplication and loss. Because current methods to reconstruct phylogenies fail to model the two-way dependency between gene trees and the species tree, they often misrepresent gene and species histories. We present a new probabilistic model to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families. We use simulations to show that this method accurately infers the species tree and gene trees, is robust to misspecification of the models of sequence and gene family evolution, and provides a precise historical record of gene duplications and losses throughout genome evolution. We simultaneously reconstruct the history of mammalian species and their genes based on 36 completely sequenced genomes, and use the reconstructed gene trees to infer the gene content and organization of ancestral mammalian genomes. We show that our method yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.