
Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-based Phylogenetic Analysis

Overview
Journal PLoS One
Date 2015 Mar 24
PMID 25799056
Citations 4
Abstract

Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. However, it may be unclear to the research community that a presented tree is just one hypothesis, chosen from among many possible alternatives. It is therefore important to quantify our confidence in both the trees and the branches/edges they include. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected, weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs in which a given edge is present. The metric provides a per-edge statistic similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff's matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric with respect to both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing (MLST) data and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selecting the edge to be represented using bootstrap could lead to unreliable results, since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models.
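
To make the definition concrete, here is a minimal Python sketch of spanning edge betweenness on a toy graph. It is not the authors' implementation: it enumerates every spanning tree by brute force, which is only feasible for a handful of nodes, whereas the paper describes an exact method built on Kirchhoff's matrix tree theorem. In this sketch the theorem is used only to count all spanning trees, ignoring weights, as a cross-check. The function names, the `(u, v, weight)` edge format, and the example graph are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def is_spanning_tree(n, tree_edges):
    """Return True if the given n-1 edges contain no cycle (hence span all n nodes)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True


def spanning_edge_betweenness(n, edges):
    """Brute force: fraction of minimum spanning trees that contain each edge.

    Assumes a connected graph, integer weights, and no duplicate edge tuples.
    """
    trees = [t for t in combinations(edges, n - 1) if is_spanning_tree(n, t)]
    best = min(sum(w for _, _, w in t) for t in trees)
    msts = [t for t in trees if sum(w for _, _, w in t) == best]
    scores = {e: Fraction(sum(e in t for t in msts), len(msts)) for e in edges}
    return scores, len(msts)


def count_spanning_trees(n, edges):
    """Kirchhoff's theorem: any cofactor of the graph Laplacian counts spanning trees."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v, _ in edges:  # weights are ignored here
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    m = [row[1:] for row in lap[1:]]  # delete row 0 and column 0
    size, det = n - 1, Fraction(1)
    for c in range(size):  # exact Gaussian elimination
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, size):
            f = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= f * m[c][k]
    return int(det)


# Hypothetical toy graph: a triangle of weight-1 edges, with node 3 attached
# by one cheap edge (weight 2) and one expensive edge (weight 5).
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 2), (1, 3, 5)]
scores, n_msts = spanning_edge_betweenness(4, edges)
n_trees = count_spanning_trees(4, edges)
```

In this toy graph, any two of the three triangle edges give an equally good tree, so each triangle edge appears in two of the three MSTs. A tree drawing shows only one of those alternatives, which is the ambiguity the per-edge statistic is meant to expose.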

Citing Articles

Web tools to fight pandemics: the COVID-19 experience.

Mercatelli D, Holding A, Giorgi F Brief Bioinform. 2020; 22(2):690-700.

PMID: 33057582 PMC: 7665357. DOI: 10.1093/bib/bbaa261.


In-Depth Longitudinal Study of Listeria monocytogenes ST9 Isolates from the Meat Processing Industry: Resolving Diversity and Transmission Patterns Using Whole-Genome Sequencing.

Fagerlund A, Langsrud S, Moretro T Appl Environ Microbiol. 2020; 86(14).

PMID: 32414794 PMC: 7357480. DOI: 10.1128/AEM.00579-20.


Degree and centrality-based approaches in network-based variable selection: Insights from the Singapore Longitudinal Aging Study.

Valenzuela J, Monterola C, Tong V, Fulop T, Ng T, Larbi A PLoS One. 2019; 14(7):e0219186.

PMID: 31318894 PMC: 6638841. DOI: 10.1371/journal.pone.0219186.


Evidence for Host-Genotype Associations of Borrelia burgdorferi Sensu Stricto.

Mechai S, Margos G, Feil E, Barairo N, Lindsay L, Michel P PLoS One. 2016; 11(2):e0149345.

PMID: 26901761 PMC: 4763156. DOI: 10.1371/journal.pone.0149345.
