Insight into Biases and Sequencing Errors for Amplicon Sequencing with the Illumina MiSeq Platform

Overview
Specialty: Biochemistry
Date: 2015 Jan 15
PMID: 25586220
Citations: 324
Abstract

With read lengths currently up to 2 × 300 bp, high throughput and low sequencing costs, Illumina's MiSeq is becoming one of the most widely used sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood, and programs are therefore not designed for the idiosyncrasies of Illumina data. Better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore, we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%.
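The closing figure in the abstract is a relative reduction in the substitution error rate. As a rough illustration of how such a number can be computed, the sketch below compares reads against a known reference before and after correction. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the pre-aligned input format and the toy sequences are all hypothetical, and no Sickle, BayesHammer or PANDAseq step is actually run.

```python
def substitution_error_rate(aligned_pairs):
    """Fraction of aligned, non-gap positions where read and reference differ.

    aligned_pairs: iterable of (read, reference) strings of equal length,
    already aligned, with '-' marking gaps (assumed input format).
    """
    mismatches = 0
    compared = 0
    for read, ref in aligned_pairs:
        if len(read) != len(ref):
            raise ValueError("aligned sequences must have equal length")
        for read_base, ref_base in zip(read, ref):
            if read_base == "-" or ref_base == "-":
                continue  # gaps are insertions/deletions, not substitutions
            compared += 1
            if read_base != ref_base:
                mismatches += 1
    return mismatches / compared if compared else 0.0


def relative_reduction(rate_before, rate_after):
    """Relative drop in error rate, as a fraction of the starting rate."""
    if rate_before == 0:
        raise ValueError("rate_before must be non-zero")
    return (rate_before - rate_after) / rate_before


# Toy data (hypothetical, not taken from the study).
reference = "ACGTACGA"
raw_reads = [("ACGTTCGA", reference), ("ACGAACGT", reference)]
corrected_reads = [("ACGTACGA", reference), ("ACGTACGT", reference)]

raw_rate = substitution_error_rate(raw_reads)
corrected_rate = substitution_error_rate(corrected_reads)
reduction = relative_reduction(raw_rate, corrected_rate)

print(f"raw substitution error rate:       {raw_rate:.4f}")
print(f"corrected substitution error rate: {corrected_rate:.4f}")
print(f"relative reduction:                {reduction:.1%}")
```

With real data, the pairs would come from aligning reads of a mock community to its known reference sequences. The 93% reported in the abstract is the average of this kind of relative reduction for the best-performing combination of tools.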

Citing Articles

Homogeneity Between Cervical and Vaginal Microbiomes and the Diagnostic Limitations of 16S Sequencing for STI Pathogens at Higher Ct Values.

Neidhofer C, Condic M, Hahn N, Otten L, Ralser D, Wetzig N. Int J Mol Sci. 2025; 26(5).

PMID: 40076607 PMC: 11899988. DOI: 10.3390/ijms26051983.


Microbiota transplantation for cotton leaf curl disease suppression-core microbiome and transcriptome dynamics.

Badar A, Aqueel R, Nawaz A, Zeeshan Ijaz U, Malik K. Commun Biol. 2025; 8(1):380.

PMID: 40050684 PMC: 11885576. DOI: 10.1038/s42003-025-07812-7.


Next-Generation Sequencing Methods to Determine the Accuracy of Retroviral Reverse Transcriptases: Advantages and Limitations.

Martinez Del Rio J, Menendez-Arias L. Viruses. 2025; 17(2).

PMID: 40006928 PMC: 11861041. DOI: 10.3390/v17020173.


Variation of gene ratios in mock communities constructed with purified 16S rRNA during processing.

Nammoura Neto G, Schneider R. Sci Rep. 2024; 14(1):31577.

PMID: 39738093 PMC: 11686170. DOI: 10.1038/s41598-024-61614-1.


Untrimmed ITS2 metabarcode sequences cause artificially reduced abundances of specific fungal taxa.

Kyle K, Klassen J. Appl Environ Microbiol. 2024; 91(1):e0153724.

PMID: 39723817 PMC: 11784184. DOI: 10.1128/aem.01537-24.

