
StrainIQ: A Novel n-Gram-Based Method for Taxonomic Profiling of Human Microbiota at the Strain Level

Overview
Journal Genes (Basel)
Publisher MDPI
Date 2023 Aug 26
PMID 37628698
Abstract

The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools for in-depth analysis of metagenomic datasets. Identifying strain-level variation in microbial communities is important for understanding the onset and progression of diseases, host-pathogen interrelationships, and drug resistance, as well as for designing new therapeutic regimens. In this study, we developed a novel tool called StrainIQ (strain identification and quantification), based on a new n-gram-based algorithm (an n-gram being a series of n adjacent nucleotides in the DNA sequence), for predicting and quantifying strain-level taxa from whole-genome metagenomic sequencing data. We thoroughly evaluated our method using simulated and mock metagenomic datasets and compared its performance with existing methods. On average, it showed 85.8% sensitivity and 78.2% specificity on simulated datasets. It also showed higher specificity and sensitivity when using n-gram models built from reduced reference genomes and on models with lower-coverage sequencing data. It outperforms alternative approaches in genus- and strain-level prediction and in strain abundance estimation. Overall, the results show that StrainIQ achieves high accuracy by implementing customized model-building and is an efficient tool for site-specific microbial community profiling.
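To make the two central notions of the abstract concrete, the following is a minimal illustrative sketch in Python. It is not the StrainIQ implementation, whose details the abstract does not give. The function names `ngram_counts` and `sensitivity_specificity`, the choice to skip windows containing ambiguous bases, and the set-based definition of true and false positives are all assumptions made for illustration only.

```python
from collections import Counter


def ngram_counts(sequence: str, n: int) -> Counter:
    """Count every overlapping n-gram (n adjacent nucleotides) in a DNA sequence.

    Hypothetical helper: windows containing anything other than A, C, G, T
    are skipped, which is an assumption, not a documented StrainIQ rule.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if set(gram) <= set("ACGT"):
            counts[gram] += 1
    return counts


def sensitivity_specificity(predicted: set, truth: set, candidates: set):
    """Return (sensitivity, specificity) for a set of predicted strains.

    `candidates` is the assumed universe of strains that could be reported,
    for example all strains in the reference model.
    """
    tp = len(predicted & truth)               # strains correctly reported
    fn = len(truth - predicted)               # true strains that were missed
    fp = len(predicted - truth)               # strains reported but absent
    tn = len(candidates - predicted - truth)  # absent strains, not reported
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    print(ngram_counts("ACGTACGT", 3))
    print(sensitivity_specificity({"a", "b"}, {"a", "c"}, {"a", "b", "c", "d"}))
```

Under these assumptions, sensitivity is the fraction of truly present strains that are reported, and specificity is the fraction of absent candidate strains that are correctly left out.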

Citing Articles

DNA N-gram analysis framework (DNAnamer): A generalized N-gram frequency analysis framework for the supervised classification of DNA sequences.

Malamon J Heliyon. 2024; 10(17):e36914.

PMID: 39281454 PMC: 11399624. DOI: 10.1016/j.heliyon.2024.e36914.
