Probabilistic Orthology Analysis
Orthology analysis aims to identify orthologous genes and gene products from different organisms and is therefore a powerful tool in modern computational and experimental biology. Although reconciliation-based orthology methods are generally considered more accurate than distance-based ones, the traditional parsimony-based implementation of reconciliation-based orthology analysis (most parsimonious reconciliation [MPR]) suffers from a number of shortcomings. For example, (1) it is limited to orthology predictions from the reconciliation that minimizes the number of gene duplication and loss events, (2) it cannot evaluate the support for this reconciliation relative to the other reconciliations, and (3) it cannot make use of prior knowledge (e.g., about species divergence times) that provides auxiliary information for orthology predictions. We present a probabilistic approach to reconciliation-based orthology analysis that addresses all these issues by estimating orthology probabilities. The method is based on the gene evolution model, an explicit evolutionary model for gene duplication and gene loss inside a species tree that generalizes the standard birth-death process. We illustrate the probabilistic approach to orthology analysis using 2 experimental data sets and show that the use of orthology probabilities allows a more informative analysis than MPR and, in particular, that it is less sensitive to taxon sampling problems. We generalize these anecdotal observations and show, using data generated under biologically realistic conditions, that MPR gives false orthology predictions at a substantial frequency. Last, we provide a new orthology prediction method that allows an orthology and paralogy classification with any chosen sensitivity/specificity combination from the spectrum of achievable combinations.
We conclude that probabilistic orthology analysis is a strong and more advanced alternative to traditional orthology analysis and that it provides a framework for sophisticated comparative studies of processes in genome evolution.
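As an illustration of how orthology probabilities can be turned into a classification with a chosen sensitivity/specificity trade-off, the sketch below applies a simple probability threshold to gene pairs. Thresholding is an assumption made here for illustration and is not necessarily the procedure used by the authors; the function names, gene labels, probabilities, and reference labels are all hypothetical toy values.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # a pair of gene identifiers (hypothetical labels)


def classify(probs: Dict[Pair, float], threshold: float) -> Dict[Pair, bool]:
    """Call a pair orthologous (True) if its probability reaches the threshold,
    otherwise call it paralogous (False)."""
    return {pair: p >= threshold for pair, p in probs.items()}


def sensitivity_specificity(
    probs: Dict[Pair, float], truth: Dict[Pair, bool], threshold: float
) -> Tuple[float, float]:
    """Compare thresholded calls against known labels (e.g., from simulated data)."""
    calls = classify(probs, threshold)
    tp = sum(1 for k in calls if calls[k] and truth[k])
    fn = sum(1 for k in calls if not calls[k] and truth[k])
    tn = sum(1 for k in calls if not calls[k] and not truth[k])
    fp = sum(1 for k in calls if calls[k] and not truth[k])
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy orthology probabilities and reference labels (made up for illustration).
toy_probs = {
    ("a1", "b1"): 0.95,
    ("a1", "b2"): 0.40,
    ("a2", "b1"): 0.70,
    ("a2", "b2"): 0.10,
}
toy_truth = {
    ("a1", "b1"): True,
    ("a1", "b2"): False,
    ("a2", "b1"): True,
    ("a2", "b2"): False,
}

if __name__ == "__main__":
    # Sweeping the threshold traces out the achievable combinations.
    for t in (0.3, 0.5, 0.9):
        print(t, sensitivity_specificity(toy_probs, toy_truth, t))
```

Raising the threshold trades sensitivity for specificity and lowering it does the reverse. A parsimony-based method such as MPR, by contrast, yields a single fixed set of predictions.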