PMID: 38785221

Tracking SARS-CoV-2 Variants of Concern in Wastewater: an Assessment of Nine Computational Tools Using Simulated Genomic Data

Abstract

Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population. Capturing this genetic diversity by WBS sequencing is not trivial, as wastewater samples often contain a diverse mixture of viral lineages with real mutations and sequencing errors, which must be deconvoluted computationally from short sequencing reads. In this study, we assess nine different computational tools that have recently been developed to address this challenge. We simulated 100 wastewater sequence samples consisting of SARS-CoV-2 BA.1, BA.2, and Delta lineages, in various mixtures, as well as a Delta-Omicron recombinant and a synthetic 'novel' lineage. Most tools performed well in identifying the true lineages present and estimating their relative abundances, and were generally robust to variation in sequencing depth and read length. While many tools identified lineages present down to 1 % frequency, results were more reliable above a 5 % threshold. The presence of an unknown synthetic lineage, which represents an unclassified SARS-CoV-2 lineage, increased the error in relative abundance estimates of other lineages, but the magnitude of this effect was small for most tools. The tools also varied in how they labelled novel synthetic lineages and recombinants. While our simulated dataset represents just one of many possible use cases for these methods, we hope it helps users understand potential sources of error or bias in wastewater sequencing analysis and appreciate the commonalities and differences across methods.
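
To make the deconvolution problem concrete, the following is a minimal sketch of estimating relative lineage abundances from per-site mutation frequencies. It is not the method of any of the nine tools assessed. The mutation signatures are hypothetical toy values (not real SARS-CoV-2 lineage definitions), the function names are invented for illustration, and the solver is a simple multiplicative-update scheme for non-negative least squares chosen only because it needs no external libraries.

```python
# Toy lineage deconvolution: a sketch under stated assumptions, not a real tool.
# Each lineage is described by a 0/1 vector saying whether it carries the
# alternate allele at each of six HYPOTHETICAL sites (illustrative only).
SIGNATURES = {
    "BA.1":  [1, 1, 0, 0, 1, 0],
    "BA.2":  [1, 0, 1, 0, 0, 1],
    "Delta": [0, 0, 0, 1, 1, 1],
}


def predict_frequencies(signatures, abundances):
    """Expected alternate-allele frequency at each site for a given mixture."""
    n_sites = len(next(iter(signatures.values())))
    return [
        sum(abundances[name] * sig[site] for name, sig in signatures.items())
        for site in range(n_sites)
    ]


def estimate_abundances(signatures, observed, iterations=500):
    """Estimate non-negative lineage abundances from observed site frequencies.

    Uses multiplicative updates for non-negative least squares, then
    normalises the result so that the abundances sum to 1.
    """
    lineages = list(signatures)
    n_sites = len(observed)
    weights = [1.0 / len(lineages)] * len(lineages)  # uniform starting guess
    for _ in range(iterations):
        predicted = predict_frequencies(signatures, dict(zip(lineages, weights)))
        updated = []
        for weight, name in zip(weights, lineages):
            sig = signatures[name]
            numer = sum(sig[s] * observed[s] for s in range(n_sites))
            denom = sum(sig[s] * predicted[s] for s in range(n_sites))
            updated.append(weight * numer / denom if denom > 0 else 0.0)
        weights = updated
    total = sum(weights)
    if total == 0:
        return {name: 0.0 for name in lineages}
    return {name: weight / total for name, weight in zip(lineages, weights)}


# Simulate a noise-free sample with a known mixture, then try to recover it.
TRUE_MIX = {"BA.1": 0.6, "BA.2": 0.3, "Delta": 0.1}
observed = predict_frequencies(SIGNATURES, TRUE_MIX)
estimates = estimate_abundances(SIGNATURES, observed)
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

This sketch works on idealised, noise-free frequencies. Real wastewater data add sequencing errors, uneven depth, and lineages that share most of their mutations, and a lineage absent from the signature set would leave frequencies that the known lineages cannot fully explain.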

Citing Articles

Genomic surveillance of Canadian airport wastewater samples allows early detection of emerging SARS-CoV-2 lineages.

Overton A, Knapp J, Lawal O, Gibson R, Fedynak A, Adebiyi A. Sci Rep. 2024; 14(1):26534.

PMID: 39489759 PMC: 11532424. DOI: 10.1038/s41598-024-76925-6.


Real-Time Monitoring of SARS-CoV-2 Variants in Oklahoma Wastewater through Allele-Specific RT-qPCR.

Shelton K, Deshpande G, Sanchez G, Vogel J, Miller A, Florea G. Microorganisms. 2024; 12(10).

PMID: 39458310 PMC: 11509313. DOI: 10.3390/microorganisms12102001.


Synthetic data: how could it be used in infectious disease research?

Fragkouli S, Solanki D, Castro L, Psomopoulos F, Queralt-Rosinach N, Cirillo D. Future Microbiol. 2024; 19(17):1439-1444.

PMID: 39345126 PMC: 11492709. DOI: 10.1080/17460913.2024.2400853.


Reconstructing SARS-CoV-2 lineages from mixed wastewater sequencing data.

Ellmen I, Overton A, Knapp J, Nash D, Ho H, Hungwe Y. Sci Rep. 2024; 14(1):20273.

PMID: 39217200 PMC: 11365997. DOI: 10.1038/s41598-024-70416-4.


Impact of reference design on estimating SARS-CoV-2 lineage abundances from wastewater sequencing data.

Assmann E, Agrawal S, Orschler L, Bottcher S, Lackner S, Holzer M. Gigascience. 2024; 13.

PMID: 39115959 PMC: 11308188. DOI: 10.1093/gigascience/giae051.

