Optimizing Taxonomic Classification of Marker-gene Amplicon Sequences with QIIME 2's Q2-feature-classifier Plugin

Overview
Journal: Microbiome
Publisher: BioMed Central
Specialties: Genetics, Microbiology
Date: 2018 May 19
PMID: 29773078
Citations: 1924
Abstract

Background: Taxonomic classification of marker-gene sequences is an important step in microbiome analysis.

Results: We present q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of the other commonly used marker-gene classification methods evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit (https://github.com/caporaso-lab/tax-credit-data).

Conclusions: Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
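
As a rough illustration of what an alignment-based taxonomy consensus method does, and of why parameter choices matter, the sketch below assigns a query the deepest lineage that a minimum fraction of its top alignment hits agree on. This is a minimal sketch under stated assumptions, not the plugin's implementation: the function name `consensus_taxonomy`, the `min_consensus` parameter, the 0.51 default, and the example lineages are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumed seven-rank lineage layout; real reference databases may differ.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def consensus_taxonomy(hit_lineages, min_consensus=0.51):
    """Return the deepest lineage that at least `min_consensus` of hits share.

    `hit_lineages` is a list of lineages (one per alignment hit), each a
    list of labels ordered from kingdom to species.
    """
    if not hit_lineages:
        return ["Unassigned"]

    consensus = []
    n_hits = len(hit_lineages)
    for depth in range(len(RANKS)):
        # Count labels at this rank among hits that agree with the
        # consensus built so far and are annotated this deep.
        labels = Counter(
            lineage[depth]
            for lineage in hit_lineages
            if len(lineage) > depth and list(lineage[:depth]) == consensus
        )
        if not labels:
            break
        label, count = labels.most_common(1)[0]
        # Stop descending once agreement drops below the threshold.
        if count / n_hits < min_consensus:
            break
        consensus.append(label)
    return consensus or ["Unassigned"]


# Hypothetical top hits for one query sequence.
shared = ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
          "Lactobacillaceae", "Lactobacillus"]
hits = [
    shared + ["Lactobacillus crispatus"],
    shared + ["Lactobacillus crispatus"],
    shared + ["Lactobacillus iners"],
]

lenient = consensus_taxonomy(hits, min_consensus=0.51)
strict = consensus_taxonomy(hits, min_consensus=0.9)
print(RANKS[len(lenient) - 1], lenient[-1])
print(RANKS[len(strict) - 1], strict[-1])
```

With these made-up hits, two of three agree at species level, so the lenient threshold yields a species-level assignment while the strict threshold stops at genus. The same trade-off between assignment depth and agreement is the kind of behavior that parameter tuning is meant to balance.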

Citing Articles

Homogeneity Between Cervical and Vaginal Microbiomes and the Diagnostic Limitations of 16S Sequencing for STI Pathogens at Higher Ct Values.

Neidhofer C, Condic M, Hahn N, Otten L, Ralser D, Wetzig N. Int J Mol Sci. 2025; 26(5).

PMID: 40076607 PMC: 11899988. DOI: 10.3390/ijms26051983.


The Gut Microbiota of the Greater Horseshoe Bat Confers Rapidly Corresponding Immune Cells in Mice.

Luo S, Huang X, Chen S, Li J, Wu H, He Y. Animals (Basel). 2025; 15(5).

PMID: 40075967 PMC: 11899282. DOI: 10.3390/ani15050685.


Diversity and Structure of the Prokaryotic Community in Tropical Monomictic Reservoir.

Barjau-Aguilar M, Reyes-Hernandez A, Merino-Ibarra M, Vilaclara G, Ramirez-Zierold J, Alcantara-Hernandez R. Microb Ecol. 2025; 88(1):12.

PMID: 40072582 PMC: 11903632. DOI: 10.1007/s00248-025-02508-1.


Gut microbiota analysis in cirrhosis and non-cirrhotic portal hypertension suggests that portal hypertension can be main factor of cirrhosis-specific dysbiosis.

Gulyaeva K, Nadinskaia M, Maslennikov R, Aleshina Y, Goptar I, Lukashev A. Sci Rep. 2025; 15(1):8394.

PMID: 40069378 PMC: 11897210. DOI: 10.1038/s41598-025-92618-0.


Using gut microbiome metagenomic hypervariable features for diabetes screening and typing through supervised machine learning.

Chavarria X, Park H, Oh S, Kang D, Choi J, Kim M. Microb Genom. 2025; 11(3).

PMID: 40063675 PMC: 11893737. DOI: 10.1099/mgen.0.001365.

