
FVC As an Adaptive and Accurate Method for Filtering Variants from Popular NGS Analysis Pipelines

Overview
Journal Commun Biol
Specialty Biology
Date 2022 Sep 16
PMID 36114280
Abstract

Quality control of variants from whole-genome sequencing data is vital in clinical diagnosis and human genetics research. However, current filtering methods (Frequency, Hard-Filter, VQSR, GARFIELD, and VEF) were each developed for particular variant callers and have certain limitations. In particular, the true variants these methods eliminate far outnumber the false variants they remove. Here, we present FVC, an adaptive method for quality control of genetic variants from different analysis pipelines, and validate it on variants generated by four popular variant callers (GATK HaplotypeCaller, Mutect2, VarScan2, and DeepVariant). FVC consistently exhibited the best performance: it removed far more false variants than the current state-of-the-art filtering methods and recovered ~51-99% of the true variants that the other methods filtered out. Once trained, FVC can be conveniently integrated into a user-specific variant calling pipeline.
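
The comparison made in the abstract (true variants lost versus false variants removed by a filter, and the share of lost true variants that another filter retains) can be made concrete with simple set arithmetic. The following is a minimal sketch and not the authors' evaluation code: the function names `filter_tradeoff` and `recovered_fraction`, the `(chrom, pos, ref, alt)` tuple representation, and the toy data are all assumptions made for illustration, and the truth set is assumed to come from some benchmark callset.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation of a variant: (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]


def filter_tradeoff(
    called: Set[Variant], kept: Set[Variant], truth: Set[Variant]
) -> Dict[str, int]:
    """Count how many true and false variants a filter removed or kept.

    called: every variant emitted by the caller
    kept:   the variants that passed the filter
    truth:  variants known to be real (assumed benchmark set)
    """
    kept = kept & called          # ignore anything the caller never emitted
    removed = called - kept
    return {
        "true_removed": len(removed & truth),   # real variants lost
        "false_removed": len(removed - truth),  # errors correctly dropped
        "true_kept": len(kept & truth),
        "false_kept": len(kept - truth),
    }


def recovered_fraction(
    removed_by_other: Set[Variant], kept_by_new: Set[Variant], truth: Set[Variant]
) -> float:
    """Fraction of true variants dropped by one filter that another filter keeps."""
    lost_true = removed_by_other & truth
    if not lost_true:
        return 0.0
    return len(lost_true & kept_by_new) / len(lost_true)


# Toy data, invented purely for illustration.
a = ("chr1", 100, "A", "G")
b = ("chr1", 200, "C", "T")
c = ("chr2", 300, "G", "A")
d = ("chr2", 400, "T", "C")

called = {a, b, c, d}
truth = {a, b, c}          # d is a false call
kept_by_other = {a, d}     # a filter that drops two true variants
kept_by_new = {a, b}       # a filter that drops one true and one false variant

print(filter_tradeoff(called, kept_by_other, truth))
print(filter_tradeoff(called, kept_by_new, truth))
print(recovered_fraction(called - kept_by_other, kept_by_new, truth))
```

A filter is preferable under this view when `false_removed` is high while `true_removed` stays low; the recovered fraction corresponds to the kind of percentage the abstract reports for true variants that other methods had filtered out.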
