Reconciliation with Non-binary Species Trees

Overview
Journal J Comput Biol
Date 2008 Sep 24
PMID 18808330
Citations 70
Abstract

Reconciliation extracts information from the topological incongruence between gene and species trees to infer duplications and losses in the history of a gene family. The inferred duplication-loss histories provide valuable information for a broad range of biological applications, including ortholog identification, estimating gene duplication times, and rooting and correcting gene trees. While reconciliation for binary trees is a tractable and well-studied problem, there are no algorithms for reconciliation with non-binary species trees. Yet a striking proportion of species trees are non-binary. For example, 64% of branch points in the NCBI taxonomy have three or more children. When applied to non-binary species trees, current algorithms overestimate the number of duplications because they cannot distinguish between duplication and incomplete lineage sorting. We present the first algorithms for reconciling binary gene trees with non-binary species trees under a duplication-loss parsimony model. Our algorithms utilize an efficient mapping from gene to species trees to infer the minimum number of duplications in O(|V(G)| · (k(S) + h(S))) time, where |V(G)| is the number of nodes in the gene tree, h(S) is the height of the species tree, and k(S) is the size of its largest polytomy. We present a dynamic programming algorithm that also minimizes the total number of losses. Although this algorithm is exponential in the size of the largest polytomy, it performs well in practice for polytomies with out-degree of 12 or less. We also present a heuristic that estimates the minimal number of losses in polynomial time. In empirical tests, this heuristic finds an optimal loss history 99% of the time. Our algorithms have been implemented in NOTUNG, a robust, production-quality, tree-fitting program, which provides a graphical user interface for exploratory analysis and also supports automated, high-throughput analysis of large data sets.
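
The overestimate described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper or from NOTUNG: it implements only the standard binary-tree criterion, in which each gene-tree node is mapped to the lowest common ancestor (LCA) of its children's mappings and is counted as a duplication when it maps to the same species node as one of its children. The tree encodings, the node names (A, B, C, AB, ABC) and the function names are hypothetical choices made for this example.

```python
# Illustrative sketch only: standard LCA reconciliation for binary gene trees.
# All names and toy trees below are assumptions, not taken from the paper.

def ancestors(node, parent):
    """Return the path from node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def lca(a, b, parent):
    """Lowest common ancestor of a and b, found by walking parent pointers."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def count_lca_duplications(gene_tree, parent):
    """Count duplications with the standard binary-tree LCA criterion.

    gene_tree is a nested 2-tuple whose leaves are species names.
    Returns (species node the subtree root maps to, number of duplications).
    """
    if isinstance(gene_tree, str):
        return gene_tree, 0
    left, right = gene_tree
    m_left, d_left = count_lca_duplications(left, parent)
    m_right, d_right = count_lca_duplications(right, parent)
    m_here = lca(m_left, m_right, parent)
    # Duplication if this node maps to the same species node as a child.
    is_dup = m_here in (m_left, m_right)
    return m_here, d_left + d_right + int(is_dup)

# Species trees as child -> parent maps.
POLYTOMY = {"A": "ABC", "B": "ABC", "C": "ABC", "ABC": None}   # (A, B, C)
RESOLVED = {"A": "AB", "B": "AB", "AB": "ABC", "C": "ABC", "ABC": None}  # ((A, B), C)

GENE_TREE = (("A", "B"), "C")

dups_polytomy = count_lca_duplications(GENE_TREE, POLYTOMY)[1]
dups_resolved = count_lca_duplications(GENE_TREE, RESOLVED)[1]
print(dups_polytomy, dups_resolved)
```

On this toy input the binary criterion reports one duplication against the three-way polytomy but none against the binary resolution ((A, B), C), even though the gene tree agrees with that resolution. This is the kind of spurious duplication that a polytomy-aware reconciliation has to avoid.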

Citing Articles

Inferences on the evolution of the ascorbic acid synthesis pathway in insects using Phylogenetic Tree Collapser (PTC), a tool for the automated collapsing of phylogenetic trees using taxonomic information.

Glez-Pena D, Lopez-Fernandez H, Duque P, Vieira C, Vieira J. J Integr Bioinform. 2024; 21(2).

PMID: 39054685 PMC: 11377030. DOI: 10.1515/jib-2023-0051.


The Theory of Gene Family Histories.

Hellmuth M, Stadler P. Methods Mol Biol. 2024; 2802:1-32.

PMID: 38819554 DOI: 10.1007/978-1-0716-3838-5_1.


Bioinformatic Characterization and Molecular Evolution of the Hemoglobins.

Montes-Rodriguez I, Cadilla C, Lopez-Garriga J, Gonzalez-Mendez R. Genes (Basel). 2022; 13(11).

PMID: 36360278 PMC: 9690805. DOI: 10.3390/genes13112041.


Expansion and Accelerated Evolution of 9-Exon Odorant Receptors in Polistes Paper Wasps.

Legan A, Jernigan C, Miller S, Fuchs M, Sheehan M. Mol Biol Evol. 2021; 38(9):3832-3846.

PMID: 34151983 PMC: 8383895. DOI: 10.1093/molbev/msab023.


Structural variation and evolution of chloroplast in green algae.

Qi F, Zhao Y, Zhao N, Wang K, Li Z, Wang Y. PeerJ. 2021; 9:e11524.

PMID: 34131524 PMC: 8176911. DOI: 10.7717/peerj.11524.

