
Using Affinity Propagation for Identifying Subspecies Among Clonal Organisms: Lessons from M. tuberculosis

Overview
Publisher Biomed Central
Specialty Biology
Date 2011 Jun 4
PMID 21635750
Citations 7
Abstract

Background: Classification and naming are key steps in the analysis, understanding, and adequate management of living organisms. However, deciding where to set the limits between groups can be puzzling, especially in clonal organisms. Within the Mycobacterium tuberculosis complex (MTC), the etiological agent of tuberculosis (TB), experts first identified several groups according to their patterns at repetitive sequences, especially at the CRISPR locus (spoligotyping), and to their epidemiological relevance. Most groups, such as "Beijing", found good support when tested with other loci. However, other groups, such as the T family and the T1 subfamily (belonging to the "Euro-American" lineage), correspond to non-monophyletic groups and still need to be refined. Here, we propose to use a method called Affinity Propagation, which has been successfully used in image categorization, to identify relevant patterns at the CRISPR locus in MTC.

Results: To adequately infer the relative divergence time between strains, we used a distance method inspired by the recent evolutionary model of Reyes et al. First, we confirm that this method performs better than the Jaccard index commonly used to compare spoligotype patterns. Second, we document the support for each spoligotype family in the previous classification using Affinity Propagation on the international spoligotyping database SpolDB4. This allowed us to propose a consensus assignment for all SpolDB4 spoligotypes. Third, we propose new signatures to subclassify the T family.
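
For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of the Jaccard index applied to spoligotype patterns, and of how a similarity matrix for Affinity Propagation might be assembled from it. It does not reproduce the authors' own distance, which the abstract does not specify. The function names, the three toy 43-spacer patterns, and the use of the median similarity as the exemplar "preference" on the diagonal are all assumptions made for illustration; none of them is taken from SpolDB4 or from the paper.

```python
from statistics import median


def jaccard(a: str, b: str) -> float:
    """Jaccard index between two spoligotypes written as '1'/'0' strings.

    '1' means the spacer is present, '0' means it is absent.
    """
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of spacers")
    shared = sum(x == "1" and y == "1" for x, y in zip(a, b))
    union = sum(x == "1" or y == "1" for x, y in zip(a, b))
    # Two patterns with no spacer at all are treated as identical.
    return shared / union if union else 1.0


def similarity_matrix(patterns: list[str]) -> list[list[float]]:
    """Pairwise Jaccard similarities, with a shared preference on the diagonal.

    Assumption: the diagonal (how willing each pattern is to act as an
    exemplar) is set to the median off-diagonal similarity.
    """
    n = len(patterns)
    sim = [[jaccard(patterns[i], patterns[j]) for j in range(n)] for i in range(n)]
    off_diagonal = [sim[i][j] for i in range(n) for j in range(n) if i != j]
    preference = median(off_diagonal)
    for i in range(n):
        sim[i][i] = preference
    return sim


# Hypothetical toy patterns over 43 spacers (illustration only).
toy = {
    "all_present": "1" * 43,
    "last_nine_only": "0" * 34 + "1" * 9,
    "spacers_33_36_absent": "1" * 32 + "0" * 4 + "1" * 7,
}

matrix = similarity_matrix(list(toy.values()))
for name, row in zip(toy, matrix):
    print(name, [round(value, 3) for value in row])
```

A matrix built this way could then be passed to any Affinity Propagation implementation that accepts precomputed similarities; a different distance would replace only the `jaccard` function.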

Conclusion: Altogether, this study shows how the new clustering algorithm Affinity Propagation can help build or refine classifications of clonal organisms. It also describes well-supported families and subfamilies within the M. tuberculosis complex, especially inside the modern "Euro-American" lineage.

Citing Articles

Prioritized candidate causal haplotype blocks in plant genome-wide association studies.

Wu X, Jiang W, Fragoso C, Huang J, Zhou G, Zhao H PLoS Genet. 2022; 18(10):e1010437.

PMID: 36251695 PMC: 9612827. DOI: 10.1371/journal.pgen.1010437.


Novel methods included in SpolLineages tool for fast and precise prediction of Mycobacterium tuberculosis complex spoligotype families.

Couvin D, Segretier W, Stattner E, Rastogi N Database (Oxford). 2020; 2020.

PMID: 33320180 PMC: 7737520. DOI: 10.1093/database/baaa108.


Using affinity propagation clustering for identifying bacterial clades and subclades with whole-genome sequences of Francisella tularensis.

Busch A, Homeier-Bachmann T, Abdel-Glil M, Hackbart A, Hotzel H, Tomaso H PLoS Negl Trop Dis. 2020; 14(9):e0008018.

PMID: 32991594 PMC: 7523947. DOI: 10.1371/journal.pntd.0008018.


Genomics and Machine Learning for Taxonomy Consensus: The Mycobacterium tuberculosis Complex Paradigm.

Aze J, Sola C, Zhang J, Lafosse-Marin F, Yasmin M, Siddiqui R PLoS One. 2015; 10(7):e0130912.

PMID: 26154264 PMC: 4496040. DOI: 10.1371/journal.pone.0130912.


Strain classification of Mycobacterium tuberculosis isolates in Brazil based on genotypes obtained by spoligotyping, mycobacterial interspersed repetitive unit typing and the presence of large sequence and single nucleotide polymorphism.

Vasconcellos S, Acosta C, Gomes L, Conceicao E, Lima K, de Araujo M PLoS One. 2014; 9(10):e107747.

PMID: 25314118 PMC: 4196770. DOI: 10.1371/journal.pone.0107747.


References
1. Frey B, Dueck D. Clustering by passing messages between data points. Science. 2007; 315(5814):972-6. DOI: 10.1126/science.1136800.

2. Abadia E, Zhang J, Dos Vultos T, Ritacco V, Kremer K, Aktas E. Resolving lineage assignation on Mycobacterium tuberculosis clinical isolates classified by spoligotyping with a new high-throughput 3R SNPs based method. Infect Genet Evol. 2010; 10(7):1066-74. DOI: 10.1016/j.meegid.2010.07.006.

3. Liu F, Barrangou R, Gerner-Smidt P, Ribot E, Knabel S, Dudley E. Novel virulence gene and clustered regularly interspaced short palindromic repeat (CRISPR) multilocus sequence typing scheme for subtyping of the major serovars of Salmonella enterica subsp. enterica. Appl Environ Microbiol. 2011; 77(6):1946-56. PMC: 3067318. DOI: 10.1128/AEM.02625-10.

4. Zozio T, Allix C, Gunal S, Saribas Z, Alp A, Durmaz R. Genotyping of Mycobacterium tuberculosis clinical isolates in two cities of Turkey: description of a new family of genotypes that is phylogeographically specific for Asia Minor. BMC Microbiol. 2005; 5:44. PMC: 1192800. DOI: 10.1186/1471-2180-5-44.

5. Eisenach K, Crawford J, Bates J. Repetitive DNA sequences as probes for Mycobacterium tuberculosis. J Clin Microbiol. 1988; 26(11):2240-5. PMC: 266867. DOI: 10.1128/jcm.26.11.2240-2245.1988.