Improved Gene Tree Error Correction in the Presence of Horizontal Gene Transfer
Motivation: The accurate inference of gene trees is a necessary step in many evolutionary studies. Although the problem of accurate gene tree inference has received considerable attention, most existing methods are only applicable to gene families unaffected by horizontal gene transfer. As a result, the accurate inference of gene trees affected by horizontal gene transfer remains a largely unaddressed problem.
Results: In this study, we introduce a new and highly effective method for gene tree error correction in the presence of horizontal gene transfer. Our method efficiently models horizontal gene transfers, gene duplications and losses, and uses a statistical hypothesis testing framework [Shimodaira-Hasegawa (SH) test] to balance sequence likelihood with topological information from a known species tree. Using a thorough simulation study, we show that existing phylogenetic methods yield inaccurate gene trees when applied to horizontally transferred gene families and that our method dramatically improves gene tree accuracy. We apply our method to a dataset of 11 cyanobacterial species and demonstrate the large impact of gene tree accuracy on downstream evolutionary analyses.
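The balance between sequence likelihood and species-tree information described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate gene tree topology already carries an SH-test p-value (against the maximum-likelihood tree) and a duplication-transfer-loss reconciliation cost (against the species tree). The names `Candidate`, `pick_tree`, and `alpha`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A candidate gene tree topology with precomputed scores (all hypothetical)."""
    name: str          # label for the topology
    sh_pvalue: float   # SH-test p-value versus the maximum-likelihood tree
    dtl_cost: float    # duplication-transfer-loss reconciliation cost


def pick_tree(candidates: Iterable[Candidate], alpha: float = 0.05) -> Optional[Candidate]:
    """Return the lowest-cost topology that the sequence data do not reject."""
    # Keep only topologies that are not significantly worse than the ML tree.
    supported = [c for c in candidates if c.sh_pvalue >= alpha]
    if not supported:
        return None
    # Among those, prefer the lowest reconciliation cost; break ties by p-value.
    return min(supported, key=lambda c: (c.dtl_cost, -c.sh_pvalue))


if __name__ == "__main__":
    trees = [
        Candidate("ml_tree", sh_pvalue=1.00, dtl_cost=12.0),
        Candidate("rearranged_a", sh_pvalue=0.40, dtl_cost=7.0),
        Candidate("rearranged_b", sh_pvalue=0.01, dtl_cost=3.0),  # rejected by the test
    ]
    print(pick_tree(trees).name)
```

In this toy input, `rearranged_b` has the lowest reconciliation cost but is rejected by the significance filter, so the choice falls on `rearranged_a`, which is statistically equivalent to the maximum-likelihood tree and agrees better with the species tree.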
Availability and implementation: An implementation of our method is available at http://compbio.mit.edu/treefix-dtl/
Contact: mukul@engr.uconn.edu or manoli@mit.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.