
Enhancing Diversity Analysis by Repeatedly Rarefying Next Generation Sequencing Data Describing Microbial Communities

Overview
Journal Sci Rep
Specialty Science
Date 2021 Nov 17
PMID 34785722
Citations 62
Abstract

Amplicon sequencing has revolutionized our ability to study DNA collected from environmental samples by providing a rapid and sensitive technique for microbial community analysis that eliminates the challenges associated with lab cultivation and taxonomic identification through microscopy. In water resources management, it can be especially useful to evaluate ecosystem shifts in response to natural and anthropogenic landscape disturbances to signal potential water quality concerns, such as the detection of toxic cyanobacteria or pathogenic bacteria. Amplicon sequencing data consist of discrete counts of sequence reads, the sum of which is the library size. Groups of samples typically have different library sizes that are not representative of biological variation; library size normalization is required to meaningfully compare diversity between them. Rarefaction is a widely used normalization technique that involves the random subsampling of sequences from the initial sample library to a selected normalized library size. This process is often dismissed as statistically invalid because subsampling effectively discards a portion of the observed sequences, yet it remains prevalent in practice and the suitability of rarefying, relative to many other normalization approaches, for diversity analysis has been argued. Here, repeated rarefying is proposed as a tool to normalize library sizes for diversity analyses. This enables (i) proportionate representation of all observed sequences and (ii) characterization of the random variation introduced to diversity analyses by rarefying to a smaller library size shared by all samples. While many deterministic data transformations are not tailored to produce equal library sizes, repeatedly rarefying reflects the probabilistic process by which amplicon sequencing data are obtained as a representation of the amplified source microbial community. Specifically, it evaluates which data might have been obtained if a particular sample's library size had been smaller and allows graphical representation of the effects of this library size normalization process upon diversity analysis results.
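
The repeated rarefying idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes subsampling without replacement and uses the Shannon index as the diversity measure; neither choice is specified in the abstract. The function names, the example counts, the normalized library size of 200, and the 50 repetitions are all hypothetical.

```python
import math
import random
from collections import Counter


def rarefy_once(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample's counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the sample's library size")
    # Expand counts into one entry per read, labelled by taxon index.
    reads = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    drawn = Counter(rng.sample(reads, depth))
    return [drawn.get(taxon, 0) for taxon in range(len(counts))]


def shannon(counts):
    """Shannon index (natural log) of a count vector."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def repeatedly_rarefy(counts, depth, repeats=100, seed=0):
    """Return the Shannon index of each of `repeats` independent rarefied draws."""
    rng = random.Random(seed)
    return [shannon(rarefy_once(counts, depth, rng)) for _ in range(repeats)]


# Hypothetical sample: read counts for five sequence variants (library size 1000).
sample = [500, 300, 150, 45, 5]
values = repeatedly_rarefy(sample, depth=200, repeats=50)

# The spread across repetitions reflects the random variation added by rarefying.
print(min(values), sum(values) / len(values), max(values))
```

Keeping every repetition's result, rather than a single rarefied table, is what allows the variation introduced by normalization to be summarized or plotted alongside the diversity estimate.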

Citing Articles

Size fractionation informs microbial community composition and interactions in the eastern tropical North Pacific Ocean.

Thompson M, Valentine D, Peng X. FEMS Microbes. 2025; 5:xtae028.

PMID: 40034844 PMC: 11873797. DOI: 10.1093/femsmc/xtae028.


Bacteria invade the brain following intracortical microelectrode implantation, inducing gut-brain axis disruption and contributing to reduced microelectrode performance.

Hoeferlin G, Grabinski S, Druschel L, Duncan J, Burkhart G, Weagraff G. Nat Commun. 2025; 16(1):1829.

PMID: 39979293 PMC: 11842729. DOI: 10.1038/s41467-025-56979-4.


Skin microbiomes of frogs vary among individuals and body regions, revealing differences that reflect known patterns of chytrid infection.

Ghose S, Eisen J. bioRxiv. 2025.

PMID: 39975414 PMC: 11839087. DOI: 10.1101/2025.02.05.636728.


Culturomics- and metagenomics-based insights into the soil microbiome preservation and application for sustainable agriculture.

Clagnan E, Costanzo M, Visca A, Di Gregorio L, Tabacchioni S, Colantoni E. Front Microbiol. 2024; 15:1473666.

PMID: 39526137 PMC: 11544545. DOI: 10.3389/fmicb.2024.1473666.


Female reproductive tract microbiota varies with MHC profile.

Leclaire S, Bandekar M, Rowe M, Ritari J, Jokiniemi A, Partanen J. Proc Biol Sci. 2024; 291(2033):20241334.

PMID: 39471862 PMC: 11521592. DOI: 10.1098/rspb.2024.1334.

