
Taxonomy Annotation and Guide Tree Errors in 16S rRNA Databases

Overview
Journal PeerJ
Date 2018 Jun 19
PMID 29910992
Citations 82
Abstract

Sequencing of the 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Specialized 16S rRNA databases have been developed to support this approach, including Greengenes, RDP and SILVA. Most taxonomy annotations in these databases are predictions from sequence rather than authoritative assignments based on studies of type strains or isolates. In this work, I investigated the taxonomy annotations and guide trees provided by these databases. Using a blinded test, I estimated that the annotation error rate of the RDP database is ∼10%. The branching orders of the Greengenes and SILVA guide trees were found to disagree at comparable rates with each other and with taxonomy annotations according to the training set (authoritative reference) provided by RDP, indicating that the trees have comparable quality. Pervasive conflicts between tree branching order and type strain taxonomies strongly suggest that the guide trees are unreliable guides to phylogeny. I found 249,490 identical sequences with conflicting annotations in SILVA v128 and Greengenes v13.5 at ranks up to phylum (7,804 conflicts), indicating that the annotation error rate in these databases is ∼17%.
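The comparison of identical sequences across two databases can be illustrated with a minimal sketch. It is not the method used in the paper. It assumes each database has already been loaded into a dictionary that maps a sequence string to a semicolon-delimited lineage, and that rank names and spelling have been normalized between the two sources (real Greengenes and SILVA files use different lineage conventions). The names `RANKS`, `parse_lineage`, `first_conflict_rank` and `find_conflicts`, and the toy data, are hypothetical.

```python
from collections import Counter

# Assumed rank order, most general first (hypothetical normalization).
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]


def parse_lineage(lineage):
    """Split a semicolon-delimited lineage into a list of names, one per rank."""
    return [name.strip() for name in lineage.split(";") if name.strip()]


def first_conflict_rank(lineage_a, lineage_b):
    """Return the most general rank at which two lineages disagree, or None.

    Ranks missing from either lineage are not counted as conflicts,
    because zip() stops at the shorter of the two lists.
    """
    for rank, name_a, name_b in zip(RANKS, lineage_a, lineage_b):
        if name_a != name_b:
            return rank
    return None


def find_conflicts(db_a, db_b):
    """Map each sequence present in both databases to its first conflicting rank."""
    conflicts = {}
    for seq in db_a.keys() & db_b.keys():  # sequences identical in both databases
        rank = first_conflict_rank(parse_lineage(db_a[seq]), parse_lineage(db_b[seq]))
        if rank is not None:
            conflicts[seq] = rank
    return conflicts


# Toy data for illustration only; not taken from any real database.
db_a = {
    "ACGT": "Bacteria;Firmicutes;Bacilli",
    "GGCC": "Bacteria;Proteobacteria",
    "TTAA": "Bacteria;Actinobacteria",
}
db_b = {
    "ACGT": "Bacteria;Firmicutes;Clostridia",
    "GGCC": "Bacteria;Bacteroidetes;Bacteroidia",
    "TTAA": "Bacteria;Actinobacteria;Actinobacteria",
    "CCCC": "Archaea",
}

conflicts = find_conflicts(db_a, db_b)
per_rank = Counter(conflicts.values())  # number of conflicts at each rank
print(per_rank)
```

A conflict between two annotations of the same sequence shows that at least one of them is wrong, but not which one. Turning conflict counts into an error-rate estimate therefore needs further assumptions that this sketch does not attempt.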

Citing Articles

Evaluation of 16S rRNA genes sequences and genome-based analysis for identification of non-pathogenic .

Kislichkina A, Sizova A, Skryabin Y, Dentovskaya S, Anisimov A. Front Microbiol. 2025; 15:1519733.

PMID: 39845053 PMC: 11753223. DOI: 10.3389/fmicb.2024.1519733.


MultiTax-human: an extensive and high-resolution human-related full-length 16S rRNA reference database and taxonomy.

Bao Z, Zhang B, Yao J, Li M. Microbiol Spectr. 2025; 13(2):e0131224.

PMID: 39817732 PMC: 11792508. DOI: 10.1128/spectrum.01312-24.


Ferulic Acid Relieves the Oxidative Stress Induced by Oxidized Fish Oil in Oriental River Prawn () with an Emphasis on Lipid Metabolism and Gut Microbiota.

Liu X, Sun C, Zhou Q, Zheng X, Jiang S, Wang A. Antioxidants (Basel). 2025; 13(12.

PMID: 39765792 PMC: 11672775. DOI: 10.3390/antiox13121463.


Approximate nearest neighbor graph provides fast and efficient embedding with applications for large-scale biological data.

Zhao J, Pierre Both J, Konstantinidis K. NAR Genom Bioinform. 2024; 6(4):lqae172.

PMID: 39703432 PMC: 11655291. DOI: 10.1093/nargab/lqae172.


Interplay of human ABCC11 transporter gene variants with axillary skin microbiome functional genomics.

Stevens B, Roesch L. Sci Rep. 2024; 14(1):28037.

PMID: 39543265 PMC: 11564711. DOI: 10.1038/s41598-024-78711-w.

