
Accurate Prediction of Functional Effect of Single Amino Acid Variants with Deep Learning

Overview
Specialty Biotechnology
Date 2023 Dec 11
PMID 38074467
Abstract

Assessing the functional effects of amino acid variants is a critical biological problem in proteomics, with applications in clinical medicine and protein engineering. Although naturally occurring variants offer insights into deleterious variants, high-throughput deep mutational experiments enable comprehensive investigation of amino acid variants for a given protein. However, these mutational experiments are too expensive to dissect millions of variants on thousands of proteins. Computational approaches have therefore been proposed, but they rely heavily on hand-crafted evolutionary conservation features, which limits their accuracy. Recent advances in transformers provide a promising way to precisely estimate the functional effects of protein variants from high-throughput experimental data. Here, we introduce a novel deep learning model, Rep2Mut-V2, which leverages learned representations from transformer models. Rep2Mut-V2 significantly enhances prediction accuracy for 27 types of measurements of the functional effects of protein variants. In an evaluation on 38 protein datasets with 118,933 single amino acid variants, Rep2Mut-V2 achieved an average Spearman's correlation coefficient of 0.7. This surpasses the performance of six state-of-the-art methods, including the recently released methods ESM, DeepSequence and EVE. Even with limited training data, Rep2Mut-V2 outperforms ESM and DeepSequence, showing its potential to extend high-throughput experimental analysis to more protein variants and thereby reduce experimental cost. In conclusion, Rep2Mut-V2 provides accurate predictions of the functional effects of single amino acid variants in protein-coding sequences. This tool can significantly aid the interpretation of variants in human disease studies.
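
The headline metric in the abstract is an average Spearman's correlation coefficient across protein datasets. The following is a minimal sketch of how such a per-dataset average could be computed, assuming each dataset pairs measured functional scores with predicted scores. The dataset names, the toy numbers, and the helper functions `average_ranks` and `spearman` are hypothetical illustrations and are not taken from Rep2Mut-V2 or its evaluation code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: {dataset name: (measured scores, predicted scores)}
datasets = {
    "protein_A": ([0.1, 0.4, 0.35, 0.8, 0.9], [0.2, 0.3, 0.5, 0.7, 0.95]),
    "protein_B": ([1.0, 0.2, 0.6, 0.4], [0.9, 0.1, 0.3, 0.5]),
}

# One coefficient per dataset, then an unweighted mean across datasets.
per_dataset = {name: spearman(m, p) for name, (m, p) in datasets.items()}
average_rho = mean(per_dataset.values())
```

Averaging per dataset, rather than pooling all variants, keeps a protein with many variants from dominating the summary. Whether the reported 0.7 was computed as an unweighted mean in this way is an assumption here.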

Citing Articles

MMRT: MultiMut Recursive Tree for predicting functional effects of high-order protein variants from low-order variants.

Forrest B, Derbel H, Zhao Z, Liu Q Comput Struct Biotechnol J. 2025; 27:672-681.

PMID: 40070521 PMC: 11894328. DOI: 10.1016/j.csbj.2025.02.012.


Intelligent biology and medicine: Accelerating innovative computational approaches.

Li F, Liu L, Wang K, Liu X, Zhao Z Comput Struct Biotechnol J. 2025; 27:32-34.

PMID: 39790120 PMC: 11714061. DOI: 10.1016/j.csbj.2024.11.044.


Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors.

Lin Y, Menon A, Hu Z, Brenner S Hum Genomics. 2024; 18(1):90.

PMID: 39198917 PMC: 11360829. DOI: 10.1186/s40246-024-00663-z.


Pharmacogenomics: A Genetic Approach to Drug Development and Therapy.

Qahwaji R, Ashankyty I, Sannan N, Hazzazi M, Basabrain A, Mobashir M Pharmaceuticals (Basel). 2024; 17(7).

PMID: 39065790 PMC: 11279827. DOI: 10.3390/ph17070940.


Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors.

Lin Y, Menon A, Hu Z, Brenner S bioRxiv. 2024.

PMID: 38979289 PMC: 11230257. DOI: 10.1101/2024.06.25.600283.
