
Error Correction of Next-generation Sequencing Data and Reliable Estimation of HIV Quasispecies

Overview
Specialty: Biochemistry
Date: 2010 Jul 31
PMID: 20671025
Citations: 113
Abstract

Next-generation sequencing technologies can be used to analyse genetically heterogeneous samples in unprecedented detail. The high coverage achievable with these methods enables the detection of many low-frequency variants. However, sequencing errors complicate the analysis of mixed populations and inflate estimates of genetic diversity. We developed a probabilistic Bayesian approach to minimize the effect of errors on the detection of minority variants. We applied it to pyrosequencing data obtained from a 1.5-kb fragment of the HIV-1 gag/pol gene in two control and two clinical samples, and analysed the effect of PCR amplification. Error correction decreased the pyrosequencing base substitution rate two-fold, from 0.05% to 0.03%, in the non-PCR samples and five-fold, from 0.25% to 0.05%, in the PCR-amplified samples. We were able to detect viral clones as rare as 0.1% with perfect sequence reconstruction. Probabilistic haplotype inference outperforms the counting-based calling method in both precision and recall. The genetic diversity observed within and between the two clinical samples resulted in various patterns of phenotypic drug resistance and suggests a close epidemiological link. We conclude that pyrosequencing can be used to investigate genetically diverse samples with high accuracy if technical errors are properly treated.
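The two summary quantities in the abstract, the fold decrease in substitution rate and the precision and recall of haplotype calling, can be illustrated with a minimal Python sketch. This is not the authors' Bayesian method or their evaluation code. The function names and the toy haplotype strings are hypothetical, and only the four error rates are taken from the abstract. Note that 0.05% / 0.03% is about 1.7, which the abstract presumably rounds to "two-fold".

```python
def fold_decrease(rate_before, rate_after):
    """Ratio of the error rate before correction to the rate after it."""
    return rate_before / rate_after


def precision_recall(called, truth):
    """Precision and recall of a set of called haplotypes against a known set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Substitution rates in percent, as reported in the abstract.
non_pcr_fold = fold_decrease(0.05, 0.03)  # about 1.7
pcr_fold = fold_decrease(0.25, 0.05)      # about 5

# Toy data (hypothetical): three true clones, three called haplotypes,
# two of which match a true clone.
true_haplotypes = {"ACGT", "ACGA", "TCGT"}
called_haplotypes = {"ACGT", "ACGA", "GGGG"}
precision, recall = precision_recall(called_haplotypes, true_haplotypes)

print(f"non-PCR fold decrease: {non_pcr_fold:.2f}")
print(f"PCR fold decrease: {pcr_fold:.2f}")
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

In this framing, a false positive is a called haplotype absent from the control mixture (typically a sequencing error mistaken for a variant), and a false negative is a true clone that was not reconstructed.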

Citing Articles

Next-Generation Sequencing Reveals a High Frequency of HIV-1 Minority Variants and an Expanded Drug Resistance Profile among Individuals on First-Line ART.

Nannyonjo M, Omooja J, Bugembe D, Bbosa N, Lunkuse S, Nabirye S. Viruses. 2024; 16(9).

PMID: 39339930 PMC: 11437406. DOI: 10.3390/v16091454.


A study on factors influencing delayed sputum conversion in newly diagnosed pulmonary tuberculosis based on bacteriology and genomics.

Pang M, Dai X, Wang N, Yi J, Sun S, Miao H. Sci Rep. 2024; 14(1):18550.

PMID: 39122761 PMC: 11315884. DOI: 10.1038/s41598-024-69636-5.


Methods to improve the accuracy of next-generation sequencing.

Cheng C, Fei Z, Xiao P. Front Bioeng Biotechnol. 2023; 11:982111.

PMID: 36741756 PMC: 9895957. DOI: 10.3389/fbioe.2023.982111.


K-Mer Spectrum-Based Error Correction Algorithm for Next-Generation Sequencing Data.

AlEisa H, Hamad S, Elhadad A. Comput Intell Neurosci. 2022; 2022:8077664.

PMID: 35875730 PMC: 9303089. DOI: 10.1155/2022/8077664.


Current Methods for Recombination Detection in Bacteria.

Shikov A, Malovichko Y, Nizhnikov A, Antonets K. Int J Mol Sci. 2022; 23(11).

PMID: 35682936 PMC: 9181119. DOI: 10.3390/ijms23116257.

