
Mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking

Overview
Journal mSystems
Specialty Microbiology
Date 2016 Nov 9
PMID 27822553
Citations 44
Abstract

Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. The availability of standard, public mock community data will facilitate ongoing method optimizations and comparisons across studies that share source data, increase transparency and access, and eliminate redundancy. These data sets are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

Citing Articles

Organelles in the ointment: improved detection of cryptic mitochondrial reads resolves many unknown sequences in cross-species microbiome analyses.

Sonett D, Brown T, Bengtsson-Palme J, Padilla-Gamino J, Zaneveld J ISME Commun. 2024; 4(1):ycae114.

PMID: 39660011 PMC: 11631352. DOI: 10.1093/ismeco/ycae114.


Missing microbial eukaryotes and misleading meta-omic conclusions.

Krinos A, Mars Brisbin M, Hu S, Cohen N, Rynearson T, Follows M Nat Commun. 2024; 15(1):9873.

PMID: 39543100 PMC: 11564645. DOI: 10.1038/s41467-024-52212-w.


Mock community taxonomic classification performance of publicly available shotgun metagenomics pipelines.

Valencia E, Maki K, Dootz J, Barb J Sci Data. 2024; 11(1):81.

PMID: 38233447 PMC: 10794705. DOI: 10.1038/s41597-023-02877-7.


GSR-DB: a manually curated and optimized taxonomical database for 16S rRNA amplicon analysis.

Molano L, Vega-Abellaneda S, Manichanh C mSystems. 2024; 9(2):e0095023.

PMID: 38189256 PMC: 10946287. DOI: 10.1128/msystems.00950-23.


Inferring microbial co-occurrence networks from amplicon data: a systematic evaluation.

Kishore D, Birzu G, Hu Z, DeLisi C, Korolev K, Segre D mSystems. 2023; 8(4):e0096122.

PMID: 37338270 PMC: 10469762. DOI: 10.1128/msystems.00961-22.

