
Optimizing Microbiome Reference Databases with PacBio Full-length 16S rRNA Sequencing for Enhanced Taxonomic Classification and Biomarker Discovery

Overview
Journal Front Microbiol
Specialty Microbiology
Date 2024 Dec 10
PMID 39654676
Abstract

Background: The study of the human microbiome is crucial for understanding disease mechanisms, identifying biomarkers, and guiding preventive measures. Advances in sequencing platforms, particularly 16S rRNA sequencing, have revolutionized microbiome research. Despite these benefits, large microbiome reference databases (DBs) pose challenges, including computational demands and potential inaccuracies. This study aimed to determine whether full-length 16S rRNA sequencing data produced by PacBio could be used to optimize reference DBs and then be applied to Illumina V3-V4 targeted sequencing data in microbiome studies.

Methods: Oral and gut microbiome data (PRJNA1049979) were retrieved from NCBI. DADA2 was applied to full-length 16S rRNA PacBio data to obtain amplicon sequence variants (ASVs). Taxonomy was assigned to the ASVs with the RDP reference DB, and the annotated ASVs were then used as a reference DB to train the classifier. QIIME2 was used for V3-V4 targeted Illumina data analysis. BLAST was used to analyze alignment statistics. Linear discriminant analysis Effect Size (LEfSe) was employed for discriminant analysis.
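
One way to make a full-length reference usable for V3-V4 reads is to cut each reference sequence down to the amplified region before training a classifier. The sketch below illustrates that idea only; the abstract does not state whether or how such an extraction was done. The 341F/805R primer pair, the function names, and the toy sequence in the usage note are all assumptions made for this example.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate bases.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Assumed primer pair (commonly used for V3-V4); not taken from the abstract.
FWD_341F = "CCTACGGGNGGCWGCAG"
REV_805R = "GACTACHVGGGTATCTAATCC"


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into a regex pattern string."""
    return "".join(IUPAC[base] for base in primer)


def extract_region(seq, fwd=FWD_341F, rev=REV_805R):
    """Return the sequence between the two primer sites (primers excluded).

    Returns None when either primer site is not found. Only exact matches
    to the degenerate primers are accepted; mismatches are not tolerated.
    """
    seq = seq.upper()
    fwd_hit = re.search(primer_regex(fwd), seq)
    if fwd_hit is None:
        return None
    downstream = seq[fwd_hit.end():]
    # On the forward strand the reverse primer appears reverse-complemented.
    rev_hit = re.search(primer_regex(reverse_complement(rev)), downstream)
    if rev_hit is None:
        return None
    return downstream[:rev_hit.start()]
```

For a hypothetical toy sequence such as `"TTTT" + "CCTACGGGAGGCAGCAG" + "ACGTACGT" + "GGATTAGATACCCTGGTAGTC" + "AAAA"`, `extract_region` returns the eight bases between the two primer sites. Real 16S data would usually need mismatch-tolerant primer matching, which this sketch deliberately leaves out.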

Results: ASVs produced by PacBio showed coverage of the oral microbiome similar to the Human Oral Microbiome Database. A phylogenetic tree was trimmed at various thresholds to obtain an optimized reference DB. This established method was then applied to gut microbiome data, and the optimized gut microbiome reference DB provided improved taxa classification and biomarker discovery efficiency.
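
A coverage comparison of this kind can be summarized from BLAST alignment statistics. The following is a minimal sketch, assuming the ASVs were aligned against a reference such as the Human Oral Microbiome Database and the hits were written in BLAST+ tabular format (`-outfmt 6`). The abstract does not give the output format or any cutoffs, so the 97% identity threshold and the function names are illustrative assumptions.

```python
import csv

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Keep the highest-bitscore hit for each query sequence."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["bitscore"] = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or hit["bitscore"] > best[query]["bitscore"]:
            best[query] = hit
    return best


def fraction_covered(best, min_identity=97.0, total_queries=None):
    """Fraction of queries whose best hit reaches min_identity (percent).

    Queries without any hit do not appear in tabular output, so pass
    total_queries to count them in the denominator; otherwise only
    queries with at least one hit are counted.
    """
    denominator = total_queries if total_queries is not None else len(best)
    if denominator == 0:
        return 0.0
    covered = sum(1 for hit in best.values() if hit["pident"] >= min_identity)
    return covered / denominator
```

Calling `best_hits(open("hits.tsv"))` on a hypothetical results file and passing the result to `fraction_covered` gives one number per threshold, which makes it easy to compare reference DBs of different sizes on the same query set.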

Conclusion: Full-length 16S rRNA sequencing data produced by PacBio can be used to construct a microbiome reference DB. Utilizing an optimized reference DB can increase the accuracy of microbiome classification and enhance biomarker discovery.
