
The ViReflow Pipeline Enables User Friendly Large Scale Viral Consensus Genome Reconstruction

Overview
Journal: Sci Rep
Specialty: Science
Date: 2022 Mar 25
PMID: 35332213
Abstract

Throughout the COVID-19 pandemic, massive sequencing and data sharing efforts enabled the real-time surveillance of novel SARS-CoV-2 strains throughout the world, the results of which provided public health officials with actionable information to prevent the spread of the virus. However, with great sequencing comes great computation, and while cloud computing platforms bring high-performance computing directly into the hands of all who seek it, optimal design and configuration of a cloud compute cluster requires significant system administration expertise. We developed ViReflow, a user-friendly viral consensus sequence reconstruction pipeline enabling rapid analysis of viral sequence datasets leveraging Amazon Web Services (AWS) cloud compute resources and the Reflow system. ViReflow was developed specifically in response to the COVID-19 pandemic, but it is general to any viral pathogen. Importantly, when utilized with sufficient compute resources, ViReflow can trim, map, call variants, and call consensus sequences from amplicon sequence data from 1000 SARS-CoV-2 samples at 1000X depth in < 10 min, with no user intervention. ViReflow's simplicity, flexibility, and scalability make it an ideal tool for viral molecular epidemiological efforts.
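The final step named in the abstract, calling a consensus sequence, can be illustrated with a minimal Python sketch. This is not ViReflow's implementation: the function name `call_consensus`, the per-position base-count input, and the `min_depth` and `min_freq` thresholds are all assumptions chosen for illustration. It shows only the general idea of emitting the majority base at each position and masking poorly supported positions with `N`.

```python
from collections import Counter


def call_consensus(pileup, min_depth=10, min_freq=0.5):
    """Build a consensus string from per-position base counts.

    `pileup` is a list with one dict per reference position, mapping a
    base to the number of reads supporting it (hypothetical input format).
    Positions with depth below `min_depth`, or whose most common base does
    not exceed `min_freq` of the reads, are reported as 'N'.
    """
    consensus = []
    for counts in pileup:
        depth = sum(counts.values())
        if depth == 0 or depth < min_depth:
            # Too little coverage to trust any base call.
            consensus.append("N")
            continue
        base, count = Counter(counts).most_common(1)[0]
        # Strict '>' so that an exact tie is masked rather than guessed.
        consensus.append(base if count / depth > min_freq else "N")
    return "".join(consensus)


example_pileup = [
    {"A": 20},            # well covered, unambiguous
    {"C": 3},             # below the depth threshold
    {"G": 6, "T": 6},     # tie, no majority base
    {"T": 15, "C": 2},    # clear majority despite a minor variant
]
print(call_consensus(example_pileup))
```

In this sketch the thresholds trade completeness for confidence: raising `min_depth` or `min_freq` masks more positions but reduces the chance of reporting a base supported by few or conflicting reads.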

Citing Articles

ViralFlow v1.0-a computational workflow for streamlining viral genomic surveillance.

da Silva A, da Silva Neto A, Aksenen C, Jeronimo P, Dezordi F, Almeida S. NAR Genom Bioinform. 2024; 6(2):lqae056.

PMID: 38800829 PMC: 11127631. DOI: 10.1093/nargab/lqae056.


A hepatitis B virus (HBV) sequence variation graph improves alignment and sample-specific consensus sequence construction.

Duchen D, Clipman S, Vergara C, Thio C, Thomas D, Duggal P. PLoS One. 2024; 19(4):e0301069.

PMID: 38669259 PMC: 11051683. DOI: 10.1371/journal.pone.0301069.


ViralWasm: a client-side user-friendly web application suite for viral genomics.

Ji D, Aboukhalil R, Moshiri N. Bioinformatics. 2024; 40(1).

PMID: 38200583 PMC: 10809900. DOI: 10.1093/bioinformatics/btae018.


COWID: an efficient cloud-based genomics workflow for scalable identification of SARS-COV-2.

Lim H, Fann Y, Lee Y. Brief Bioinform. 2023; 24(5).

PMID: 37738400 PMC: 10516370. DOI: 10.1093/bib/bbad280.


A hepatitis B virus (HBV) sequence variation graph improves sequence alignment and sample-specific consensus sequence construction for genetic analysis of HBV.

Duchen D, Clipman S, Vergara C, Thio C, Thomas D, Duggal P. bioRxiv. 2023.

PMID: 36711598 PMC: 9882026. DOI: 10.1101/2023.01.11.523611.

