
RefMLST: Reference-based Multilocus Sequence Typing Enables Universal Bacterial Typing

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2024 Aug 27
PMID: 39192191
Abstract

Background: Commonly used approaches for genomic investigation of bacterial outbreaks, including SNP-based and gene-by-gene approaches, are limited by the requirement for background genomes and curated allele schemes, respectively. As a result, they work only on a select subset of known organisms and fail on novel or less-studied pathogens. We introduce refMLST, a gene-by-gene approach that uses the reference genome of a bacterium to provide a scalable, reproducible and robust method for outbreak investigation.

Results: When applied to multiple outbreak-causing bacteria, including 1263 Salmonella enterica, 331 Yersinia enterocolitica and 6526 Campylobacter jejuni genomes, refMLST enabled consistent clustering, improved resolution, and faster processing compared with commonly used tools such as chewieSnake.

Conclusions: refMLST is a novel multilocus sequence typing approach that is applicable to any bacterial species with a public reference genome, does not require a curated scheme, and automatically accounts for genetic recombination.
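The abstract does not describe how refMLST is implemented, so the following is only a minimal sketch of the general gene-by-gene idea, not the authors' method. It assumes each isolate has already been reduced to one sequence per reference gene; the locus names, sequences, and helper functions (`allele_id`, `allele_profile`, `allele_distance`) are all hypothetical. Each distinct sequence at a locus is hashed into an allele identifier, so no curated allele scheme is needed, and two isolates are compared by counting the loci at which their alleles differ.

```python
import hashlib
from itertools import combinations


def allele_id(sequence: str) -> str:
    """Hash a locus sequence so identical sequences share an allele ID."""
    return hashlib.sha256(sequence.upper().encode()).hexdigest()[:12]


def allele_profile(loci: dict[str, str]) -> dict[str, str]:
    """Map each locus name to the allele ID of its sequence."""
    return {locus: allele_id(seq) for locus, seq in loci.items()}


def allele_distance(a: dict[str, str], b: dict[str, str]) -> int:
    """Count loci present in both profiles whose alleles differ."""
    shared = a.keys() & b.keys()
    return sum(1 for locus in shared if a[locus] != b[locus])


# Hypothetical toy data: two loci per isolate (not real genes).
genomes = {
    "isolate_A": {"geneX": "ATGGCTAAATGA", "geneY": "ATGCCCGGGTAA"},
    # One substitution in geneY relative to isolate_A.
    "isolate_B": {"geneX": "ATGGCTAAATGA", "geneY": "ATGCCAGGGTAA"},
    # Many substitutions in geneY, as a recombination event might produce.
    "isolate_C": {"geneX": "ATGGCTAAATGA", "geneY": "ATGAAATTTTAA"},
}

profiles = {name: allele_profile(loci) for name, loci in genomes.items()}
distances = {
    (a, b): allele_distance(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

for pair, dist in distances.items():
    print(pair, dist)
```

In this toy example every pair of isolates differs at exactly one locus, whether that locus carries one substitution or many. This is one general reason gene-by-gene distances are less inflated by recombination than raw SNP counts; whether refMLST relies on the same property is not stated in the abstract.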

Availability and Implementation: refMLST is freely available for academic use at https://bugseq.com/academic.
