Quikr: a Method for Rapid Reconstruction of Bacterial Communities Via Compressive Sensing

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2013 Jun 22
PMID: 23786768
Citations: 16
Abstract

Motivation: Many metagenomic studies compare hundreds to thousands of environmental and health-related samples by extracting and sequencing their 16S rRNA amplicons and measuring their similarity using beta-diversity metrics. However, one of the first steps, classifying the operational taxonomic units within each sample, can be computationally time-consuming because most methods compute a taxonomic assignment for every individual read, and a sample contains tens to hundreds of thousands of reads.

Results: We introduce Quikr: a QUadratic, K-mer-based, Iterative, Reconstruction method, which computes a vector of taxonomic assignments and their proportions in the sample using an optimization technique motivated by the mathematical theory of compressive sensing. On both simulated and actual biological data, we demonstrate that Quikr typically has lower error and is typically orders of magnitude faster than the most commonly used taxonomic assignment technique (the Ribosomal Database Project's Naïve Bayesian Classifier). Furthermore, the technique is shown to be unaffected by the presence of chimeras, so the time-intensive step of chimera filtering can be skipped.
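
To make the idea of reconstructing proportions from k-mer content concrete, the following is a minimal sketch and not Quikr's actual algorithm. It assumes that a sample's k-mer frequency vector is a nonnegative mixture of the k-mer frequency vectors of known reference sequences, and it recovers the mixture weights with plain projected gradient descent on a nonnegative least-squares objective, omitting any sparsity-promoting term that a compressive-sensing formulation would include. The function names, the toy reference sequences, the choice k = 2, and the noise-free sample are all hypothetical and chosen only for illustration.

```python
from itertools import product


def kmer_profile(seq, k):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # ignore k-mers containing ambiguous bases
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def estimate_proportions(ref_profiles, sample_profile, iters=2000):
    """Minimize ||A x - s||^2 subject to x >= 0 by projected gradient descent.

    ref_profiles holds the columns of A (one k-mer profile per reference);
    sample_profile is s. This is an illustrative stand-in solver only.
    """
    n, m = len(ref_profiles), len(sample_profile)
    # The squared Frobenius norm of A bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe step size.
    step = 1.0 / sum(v * v for col in ref_profiles for v in col)
    x = [1.0 / n] * n
    for _ in range(iters):
        residual = [
            sum(ref_profiles[j][i] * x[j] for j in range(n)) - sample_profile[i]
            for i in range(m)
        ]
        grad = [
            sum(ref_profiles[j][i] * residual[i] for i in range(m))
            for j in range(n)
        ]
        # Gradient step followed by projection onto the nonnegative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n)]
    total = sum(x)
    return [v / total for v in x] if total else x


# Hypothetical toy "database" of three reference sequences.
K = 2
references = ["AAAAAAAACCCC", "GGGGGGTTTT", "ACGTACGTACGT"]
ref_profiles = [kmer_profile(s, K) for s in references]

# Idealized, noise-free sample: an exact mixture of the reference profiles.
true_mix = [0.6, 0.3, 0.1]
sample_profile = [
    sum(w * p[i] for w, p in zip(true_mix, ref_profiles))
    for i in range(4 ** K)
]

estimate = estimate_proportions(ref_profiles, sample_profile)
print([round(v, 3) for v in estimate])
```

The point of the sketch is that the sample is summarized once as a single k-mer vector and the proportions are obtained from one optimization over that vector, rather than by classifying each read separately. Real data would add noise, many more reference sequences, and a larger k, and would call for a proper solver.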

Availability: The Quikr computational package (in MATLAB, Octave, Python and C) for the Linux and Mac platforms is available at http://sourceforge.net/projects/quikr/.
