
SiNPle: Fast and Sensitive Variant Calling for Deep Sequencing Data

Overview
Journal Genes (Basel)
Publisher MDPI
Date 2019 Jul 28
PMID 31349684
Citations 7
Abstract

Current high-throughput sequencing technologies can generate sequence data at very high coverage, providing detailed information on the genetic composition of samples. Deep sequencing enables the detection of rare variants in heterogeneous samples, such as viral quasi-species, but it also amplifies sequencing errors and artefacts. Distinguishing real variants from such noise is not straightforward. Variant callers that can handle pooled samples often struggle at extremely high read depths, while at lower depths they often sacrifice sensitivity for specificity. In this paper, we propose SiNPle (Simplified Inference of Novel Polymorphisms from Large coveragE), a fast and effective tool for variant calling. SiNPle is based on a simplified Bayesian approach that computes the posterior probability that a variant was not generated by sequencing errors or PCR artefacts. The Bayesian model takes into account individual base qualities and their distribution, the baseline error rates of both the sequencing and the PCR stages, the prior distribution of variant frequencies, and the strandedness of the variants. Our approach leads to an approximate but extremely fast computation of posterior probabilities even for very high coverage data, because the posterior distribution is given by a simple analytical formula in terms of summary statistics for the variants appearing at each site in the genome. These statistics can be used to filter putative SNPs and indels according to the required level of sensitivity. We tested SiNPle on several simulated and real-life viral datasets and show that it is faster and more sensitive than existing methods. The source code for SiNPle is freely available to download and compile, and is also distributed as a Conda/Bioconda package.
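To make the idea of a closed-form posterior concrete, the following is a minimal toy sketch in Python. It is not SiNPle's actual model or code: it ignores base qualities, PCR errors, and strandedness, and every name and default value in it (`posterior_real_variant`, `error_rate`, `prior_variant`) is an assumption chosen for illustration. It compares two hypotheses at a single site, "all variant reads are errors" (binomial with a fixed error rate) against "the variant is real" (binomial with a uniform prior on its frequency, whose marginal likelihood is exactly 1/(n+1)), so the posterior depends only on the summary counts `k` and `n`.

```python
import math


def log_binom_pmf(k, n, p):
    """Log-probability of k successes in n trials with success rate p."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def posterior_real_variant(k, n, error_rate=0.001, prior_variant=0.01):
    """Toy posterior probability that k variant reads out of n are real.

    Assumptions (illustrative only, not taken from SiNPle):
      - errors produce the variant base independently at `error_rate`;
      - a real variant has a frequency drawn uniformly from [0, 1];
      - `prior_variant` is the prior probability that the site is variable.
    """
    # Marginal likelihood under "real": the integral of Binomial(k; n, f)
    # over f in [0, 1] is 1 / (n + 1), independent of k.
    log_like_real = -math.log(n + 1)
    log_like_error = log_binom_pmf(k, n, error_rate)

    log_real = math.log(prior_variant) + log_like_real
    log_error = math.log1p(-prior_variant) + log_like_error

    # posterior = 1 / (1 + exp(log_error - log_real)), guarded against overflow
    diff = log_error - log_real
    if diff > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    depth = 10_000
    for count in (10, 30, 100):
        p = posterior_real_variant(count, depth)
        print(f"k={count:4d} n={depth}: posterior(real) = {p:.3g}")
```

Because the computation uses only `math.lgamma` and logarithms, its cost does not grow with read depth, which is the property the abstract highlights. A realistic caller would replace the fixed error rate with per-base quality information and treat the two strands separately.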

Citing Articles

Deletion of the s2m RNA Structure in the Avian Coronavirus Infectious Bronchitis Virus and Human Astrovirus Results in Sequence Insertions.

Keep S, Dowgier G, Lulla V, Britton P, Oade M, Freimanis G J Virol. 2023; 97(3):e0003823.

PMID: 36779761 PMC: 10062133. DOI: 10.1128/jvi.00038-23.


Identification of Amino Acids within Nonstructural Proteins 10 and 14 of the Avian Coronavirus Infectious Bronchitis Virus That Result in Attenuation and .

Keep S, Stevenson-Leggett P, Dowgier G, Everest H, Freimanis G, Oade M J Virol. 2022; 96(6):e0205921.

PMID: 35044208 PMC: 8941869. DOI: 10.1128/jvi.02059-21.


First Genomic Evidence of Dual African Swine Fever Virus Infection: Case Report from Recent and Historical Outbreaks in Sardinia.

Fiori M, Ferretti L, Floris M, Loi F, Di Nardo A, Sechi A Viruses. 2021; 13(11).

PMID: 34834952 PMC: 8618892. DOI: 10.3390/v13112145.


Genomic Diversity and Evolution of Quasispecies in Newcastle Disease Virus Infections.

Jadhav A, Zhao L, Liu W, Ding C, Nair V, Ramos-Onsins S Viruses. 2020; 12(11).

PMID: 33202558 PMC: 7698180. DOI: 10.3390/v12111305.


Patterns of RNA Editing in Newcastle Disease Virus Infections.

Jadhav A, Zhao L, Ledda A, Liu W, Ding C, Nair V Viruses. 2020; 12(11).

PMID: 33147786 PMC: 7693698. DOI: 10.3390/v12111249.

