
VAAST 2.0: Improved Variant Classification and Disease-gene Identification Using a Conservation-controlled Amino Acid Substitution Matrix

Overview
Journal: Genet Epidemiol
Specialties: Genetics, Public Health
Date: 2013 Jul 10
PMID: 23836555
Citations: 82
Abstract

The need for improved algorithmic support for variant prioritization and disease-gene identification in personal genome data is widely acknowledged. We previously presented the Variant Annotation, Analysis, and Search Tool (VAAST), which employs an aggregative variant association test that combines amino acid substitution (AAS) and allele frequency information. Here we describe and benchmark VAAST 2.0, which uses a novel conservation-controlled AAS matrix (CASM) to incorporate information about phylogenetic conservation. We show that the CASM approach improves VAAST's variant prioritization accuracy compared with its previous implementation and with SIFT, PolyPhen-2, and MutationTaster. We also show that VAAST 2.0 outperforms KBAC, WSS, SKAT, and variable threshold (VT) on published case-control datasets for Crohn disease (NOD2), hypertriglyceridemia (LPL), and breast cancer (CHEK2). VAAST 2.0 also improves search accuracy on simulated datasets across a wide range of allele frequencies, population-attributable disease risks, and degrees of allelic heterogeneity, factors that compromise the accuracy of other aggregative variant association tests. Finally, we demonstrate that, although most aggregative variant association tests are designed for common genetic diseases, they can easily be adapted as rare Mendelian disease-gene finders with a simple ranking-by-statistical-significance protocol, and that their performance compares very favorably with state-of-the-art filtering approaches. The latter, despite their popularity, perform suboptimally, especially as the number of cases increases.

Citing Articles

Isoform-level analyses of 6 cancers uncover extensive genetic risk mechanisms undetected at the gene-level.

Chang Y, Head S, Harrison T, Yu Y, Huff C, Pasaniuc B. medRxiv. 2024.

PMID: 39574839 PMC: 11581093. DOI: 10.1101/2024.10.29.24316388.


Maternal genetic variants in kinesin motor domains prematurely increase egg aneuploidy.

Biswas L, Tyc K, Aboelenain M, Sun S, Dundovic I, Vukusic K. Proc Natl Acad Sci U S A. 2024; 121(45):e2414963121.

PMID: 39475646 PMC: 11551467. DOI: 10.1073/pnas.2414963121.


Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors.

Lin Y, Menon A, Hu Z, Brenner S. Hum Genomics. 2024; 18(1):90.

PMID: 39198917 PMC: 11360829. DOI: 10.1186/s40246-024-00663-z.


Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors.

Lin Y, Menon A, Hu Z, Brenner S. bioRxiv. 2024.

PMID: 38979289 PMC: 11230257. DOI: 10.1101/2024.06.25.600283.


Statistical methods for assessing the effects of de novo variants on birth defects.

Xie Y, Wu R, Li H, Dong W, Zhou G, Zhao H. Hum Genomics. 2024; 18(1):25.

PMID: 38486307 PMC: 10938830. DOI: 10.1186/s40246-024-00590-z.

