
Effects of Sequence Diversity and Recombination on the Accuracy of Phylogenetic Trees Estimated by kSNP

Overview
Journal Cladistics
Specialty Biology
Date 2021 Nov 4
PMID 34732024
Citations 5
Abstract

kSNP v2 is a powerful tool for single nucleotide polymorphism (SNP) identification from complete microbial genomes and for estimating phylogenetic trees from the identified SNPs. kSNP can analyse finished genomes, genome assemblies, raw reads or any combination of those and does not require either genome alignment or reference genomes. This study uses sequence evolution simulations to evaluate the topological accuracy of kSNP trees and to assess the effects of diversity and recombination on that accuracy. The accuracies of kSNP trees are strongly affected by increasing diversity, with parsimony accuracy > maximum-likelihood accuracy > neighbour-joining accuracy. Accuracy is also strongly influenced by recombination; as recombination increases, accuracy decreases. Reliable trees are arbitrarily defined as those that have ≥ 90% topological accuracy. It is determined that the best predictor of topological accuracy is the ratio of r/m, a measure of the effect of recombination, to FCK (the fraction of core kmers), a measure of diversity. Tools are available to allow investigators to determine both r/m and FCK, and the relationship between topological accuracy and the ratio of r/m to FCK is determined. The practical implication of this study is that kSNP is an effective tool for estimating phylogenetic trees from microbial genome sequences provided that both recombination and sequence diversity are within acceptable ranges.
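
As an illustration of the predictor named in the abstract, the following is a minimal Python sketch of the ratio of r/m to FCK. It is not the authors' tooling. The function names are hypothetical, and the FCK calculation assumes that "fraction of core kmers" means the k-mers present in every genome divided by the k-mers present in at least one genome; the abstract does not spell out the definition, nor does it state the ratio value that corresponds to 90% topological accuracy, so no threshold is applied here.

```python
from typing import Iterable, Set


def fraction_core_kmers(genome_kmers: Iterable[Set[str]]) -> float:
    """Estimate FCK for a set of genomes.

    ASSUMPTION (not stated in the abstract): FCK is the number of k-mers
    shared by every genome divided by the number of distinct k-mers seen
    in any genome.
    """
    sets = list(genome_kmers)
    if not sets:
        raise ValueError("need at least one genome")
    union = set().union(*sets)
    if not union:
        raise ValueError("no k-mers supplied")
    core = set.intersection(*sets)
    return len(core) / len(union)


def recombination_to_diversity_ratio(r_over_m: float, fck: float) -> float:
    """Return (r/m) / FCK, the predictor described in the abstract.

    r_over_m is taken as an input; estimating it requires a separate tool.
    """
    if fck <= 0:
        raise ValueError("FCK must be positive")
    return r_over_m / fck


# Toy data: three genomes represented by tiny, made-up k-mer sets.
genomes = [
    {"AAA", "AAC", "ACG"},
    {"AAA", "AAC", "CGT"},
    {"AAA", "AAC", "ACG", "GTT"},
]
fck = fraction_core_kmers(genomes)  # 2 core k-mers out of 5 distinct
ratio = recombination_to_diversity_ratio(1.0, fck)  # r/m = 1.0 is illustrative
print(f"FCK = {fck:.2f}, (r/m)/FCK = {ratio:.2f}")
```

In practice both quantities would come from the tools the abstract refers to; the toy k-mer sets and the r/m value above are invented purely to show the arithmetic.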

Citing Articles

The First Complete Chloroplast Genome Sequence of subsp. Stapf (), the Putative Ancestor of the Genus .

Skuza L, Androsiuk P, Gastineau R, Achrem M, Paukszto L, Jastrzebski J. Curr Issues Mol Biol. 2025; 47(1).

PMID: 39852179 PMC: 11764287. DOI: 10.3390/cimb47010064.


Building Phylogenetic Trees From Genome Sequences With kSNP4.

Hall B, Nisbet J. Mol Biol Evol. 2023; 40(11).

PMID: 37948764 PMC: 10640685. DOI: 10.1093/molbev/msad235.


Amplified fragment length polymorphism and whole genome sequencing: a comparison of methods in the investigation of a nosocomial outbreak with vancomycin resistant enterococci.

Janes V, Notermans D, Spijkerman I, Visser C, Jakobs M, van Houdt R. Antimicrob Resist Infect Control. 2019; 8:153.

PMID: 31572571 PMC: 6757385. DOI: 10.1186/s13756-019-0604-5.


Population Dynamics of Staphylococcus aureus in Cystic Fibrosis Patients To Determine Transmission Events by Use of Whole-Genome Sequencing.

Ankrum A, Hall B. J Clin Microbiol. 2017; 55(7):2143-2152.

PMID: 28446577 PMC: 5483916. DOI: 10.1128/JCM.00164-17.


Whole-Genome Analysis of Antimicrobial-Resistant and Extraintestinal Pathogenic Escherichia coli in River Water.

Gomi R, Matsuda T, Matsumura Y, Yamamoto M, Tanaka M, Ichiyama S. Appl Environ Microbiol. 2016; 83(5).

PMID: 27986723 PMC: 5311411. DOI: 10.1128/AEM.02703-16.