
Next Generation Sequencing Data of a Defined Microbial Mock Community

Overview
Journal Sci Data
Specialty Science
Date 2016 Sep 28
PMID 27673566
Citations 57
Abstract

Sequence data from a defined community of organisms with complete reference genomes are indispensable for benchmarking new genome sequence analysis methods, including assembly and binning tools. Validating new sequencing library protocols and platforms, in particular assessing critical factors such as sequencing errors and biases, also relies on such datasets. Here we report next-generation metagenomic sequence data for a defined mock community (Mock Bacteria ARchaea Community; MBARC-26) composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, cover a range of GC contents, genome sizes, and repeat content, and encompass a diverse abundance profile. Short-read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, they can aid in improving the current sequence data analysis toolkit and spur interest in the development of new tools.

Citing Articles

Bioindicator "fingerprints" of methane-emitting thermokarst features in Alaskan soils.

Smallwood C, Hasson N, Yang J, Schambach J, Bennett H, Ricken B Front Microbiol. 2025; 15:1462941.

PMID: 40059907 PMC: 11885255. DOI: 10.3389/fmicb.2024.1462941.


Flowtigs: Safety in flow decompositions for assembly graphs.

Sena F, Ingervo E, Khan S, Prjibelski A, Schmidt S, Tomescu A iScience. 2025; 27(12):111208.

PMID: 39759024 PMC: 11700653. DOI: 10.1016/j.isci.2024.111208.


MIMt: a curated 16S rRNA reference database with less redundancy and higher accuracy at species-level identification.

Cabezas M, Fonseca N, Munoz-Merida A Environ Microbiome. 2024; 19(1):88.

PMID: 39522045 PMC: 11550520. DOI: 10.1186/s40793-024-00634-w.


Unveiling errors in soil microbial community sequencing: a case for reference soils and improved diagnostics for nanopore sequencing.

Manter D, Reardon C, Ashworth A, Ibekwe A, Lehman R, Maul J Commun Biol. 2024; 7(1):913.

PMID: 39069530 PMC: 11284219. DOI: 10.1038/s42003-024-06594-8.


Metapresence: a tool for accurate species detection in metagenomics based on the genome-wide distribution of mapping reads.

Sanguineti D, Zampieri G, Treu L, Campanaro S mSystems. 2024; 9(8):e0021324.

PMID: 38980053 PMC: 11338496. DOI: 10.1128/msystems.00213-24.

