How Good Are Pathogenicity Predictors in Detecting Benign Variants?

Overview
Specialty Biology
Date 2019 Feb 12
PMID 30742610
Citations 53
Abstract

Computational tools are widely used to interpret variants detected in sequencing projects. The choice of tool is critical for reliable variant impact interpretation in precision medicine and should be based on systematic performance assessment. Reported performance varies widely between assessments, for example because of differences in the contents and sizes of the test datasets. To address this issue, we obtained 63,160 common amino acid substitutions (allele frequency ≥1% and <25%) from the Exome Aggregation Consortium (ExAC) database, which contains variants from 60,706 genomes or exomes. We evaluated the specificity, that is, the capability to detect benign variants, of 10 variant interpretation tools. In addition to overall specificity, we tested the tools' performance for variants in six geographical populations. PON-P2 had the best performance (95.5%), followed by FATHMM (86.4%) and VEST (83.5%). While these tools performed excellently, the poorest method predicted more than one third of the benign variants to be disease-causing. The results allow reliable methods to be chosen for benign variant interpretation, for both research and clinical purposes, and provide a benchmark for method developers.
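
The specificity measure described in the abstract can be illustrated with a minimal Python sketch. It assumes that every variant in the common allele-frequency range (≥1% and <25%) is treated as benign, so specificity is the fraction of those variants that a tool does not call pathogenic. The `Variant` record, its field names, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper or from any of the evaluated tools.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one amino acid substitution."""
    variant_id: str
    allele_frequency: float     # as a fraction, e.g. 0.05 for 5%
    predicted_pathogenic: bool  # the tool's call for this variant


def is_common(variant: Variant, lower: float = 0.01, upper: float = 0.25) -> bool:
    """Apply the abstract's frequency filter: AF >= 1% and < 25%."""
    return lower <= variant.allele_frequency < upper


def specificity(variants: Iterable[Variant]) -> float:
    """Fraction of common (assumed benign) variants predicted benign.

    Specificity = TN / (TN + FP), where every common variant counts as a
    true negative if predicted benign and a false positive otherwise.
    """
    benign = [v for v in variants if is_common(v)]
    if not benign:
        raise ValueError("no variants fall inside the common-frequency range")
    true_negatives = sum(1 for v in benign if not v.predicted_pathogenic)
    return true_negatives / len(benign)


# Toy data, invented for illustration only.
example = [
    Variant("a", 0.05, False),
    Variant("b", 0.10, True),    # false positive
    Variant("c", 0.02, False),
    Variant("d", 0.24, False),
    Variant("e", 0.30, True),    # excluded: AF >= 25%
    Variant("f", 0.005, True),   # excluded: AF < 1%
]
print(specificity(example))  # 0.75
```

Under this definition, a reported specificity of 95.5% means that 4.5% of the common variants were predicted to be disease-causing.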

Citing Articles

There will always be variants of uncertain significance. Analysis of VUSs.

Zhang H, Kabir M, Ahmed S, Vihinen M. NAR Genom Bioinform. 2024; 6(4):lqae154.

PMID: 39633727 PMC: 11616676. DOI: 10.1093/nargab/lqae154.


Whole-exome sequencing and Drosophila modelling reveal mutated genes and pathways contributing to human ovarian failure.

Henarejos-Castillo I, Sanz F, Solana-Manrique C, Sebastian-Leon P, Medina I, Remohi J. Reprod Biol Endocrinol. 2024; 22(1):153.

PMID: 39633407 PMC: 11616368. DOI: 10.1186/s12958-024-01325-4.


AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes.

Ahmad R, Ali B, Al-Jasmi F, Al Dhaheri N, Al Turki S, Kizhakkedath P. Hum Genomics. 2024; 18(1):99.

PMID: 39256852 PMC: 11389290. DOI: 10.1186/s40246-024-00667-9.


No evidence that ACE2 or TMPRSS2 drive population disparity in COVID risks.

Pearson N, Novembre J. BMC Med. 2024; 22(1):337.

PMID: 39183295 PMC: 11346279. DOI: 10.1186/s12916-024-03539-0.


Sequence-to-expression approach to identify etiological non-coding DNA variations in P53 and cMYC-driven diseases.

Kin K, Bhogale S, Zhu L, Thomas D, Bertol J, Zheng W. Hum Mol Genet. 2024; 33(19):1697-1710.

PMID: 39017605 PMC: 11413647. DOI: 10.1093/hmg/ddae109.

