
Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses

Overview
Journal Viruses
Publisher MDPI
Specialty Microbiology
Date 2020 Jul 18
PMID 32674515
Citations 4
Abstract

Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resolution of NGS data, we developed HAplotype PHylodynamics PIPEline (HAPHPIPE), an open-source tool for the de novo and reference-based assembly of viral NGS data, with both consensus sequence assembly and a focus on the quantification of intra-host variation through haplotype reconstruction. We validate and compare the consensus sequence assembly methods of HAPHPIPE to those of two alternative software packages, HyDRA and Geneious, using simulated HIV and empirical HIV, HCV, and SARS-CoV-2 datasets. Our validation methods included read mapping, genetic distance, and genetic diversity metrics. In simulated NGS data, HAPHPIPE generated pol consensus sequences significantly closer to the true consensus sequence than those produced by HyDRA and Geneious and performed comparably to Geneious for HIV gp120 sequences. Furthermore, using empirical data from multiple viruses, we demonstrate that HAPHPIPE can analyze larger sequence datasets due to its greater computational speed. Therefore, we contend that HAPHPIPE provides a more user-friendly platform for users with and without bioinformatics experience to implement current best practices for viral NGS assembly than other currently available options.
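
As a rough illustration of the genetic distance metric mentioned in the abstract, the sketch below computes an uncorrected p-distance between an assembled consensus and a known true consensus. It assumes the two sequences are already aligned to equal length. The function name `p_distance`, the handling of gaps and ambiguous bases, and the toy sequences are illustrative assumptions; they are not part of HAPHPIPE, HyDRA, or Geneious, and the abstract does not state which distance model the study used.

```python
def p_distance(seq_a: str, seq_b: str, skip: str = "-N") -> float:
    """Proportion of differing sites between two pre-aligned sequences.

    Sites where either sequence has a character listed in `skip`
    (gaps or ambiguous bases, by assumption) are excluded.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue  # ignore gapped or ambiguous sites
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical toy sequences, not data from the study.
true_consensus = "ATGGCCATTGTA"
assembled = "ATGGCTATT-TA"
print(p_distance(true_consensus, assembled))
```

A smaller value means the assembled consensus lies closer to the true one; comparing this value across assemblers on the same simulated sample is one simple way to read the "closer to the true consensus" claim.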

Citing Articles

Deviations in RSV epidemiological patterns and population structures in the United States following the COVID-19 pandemic.

Rios-Guzman E, Simons L, Dean T, Agnes F, Pawlowski A, Alisoltanidehkordi A. Nat Commun. 2024; 15(1):3374.

PMID: 38643200 PMC: 11032338. DOI: 10.1038/s41467-024-47757-9.


ViralWasm: a client-side user-friendly web application suite for viral genomics.

Ji D, Aboukhalil R, Moshiri N. Bioinformatics. 2024; 40(1).

PMID: 38200583 PMC: 10809900. DOI: 10.1093/bioinformatics/btae018.


Altered RSV Epidemiology and Genetic Diversity Following the COVID-19 Pandemic.

Hultquist J, Rios-Guzman E, Simons L, Dean T, Agnes F, Pawlowski A. Res Sq. 2024.

PMID: 38168164 PMC: 10760306. DOI: 10.21203/rs.3.rs-3712859/v1.


HAPHPIPE: Haplotype Reconstruction and Phylodynamics for Deep Sequencing of Intrahost Viral Populations.

Bendall M, Gibson K, Steiner M, Rentia U, Perez-Losada M, Crandall K. Mol Biol Evol. 2020; 38(4):1677-1690.

PMID: 33367849 PMC: 8042772. DOI: 10.1093/molbev/msaa315.
