Discovering Functionally Important Sites in Proteins

Overview
Journal: Nat Commun
Specialty: Biology
Date: 2023 Jul 14
PMID: 37443362
Abstract

Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.
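The core idea in the abstract, separating a conservation signal from a stability signal, can be illustrated with a toy rule. The sketch below is not the authors' trained model; it is a minimal threshold-based stand-in, assuming each variant comes with a sequence-based score (more negative meaning less tolerated) and a predicted stability change (positive meaning destabilising). The class name, field names, cutoffs, and example variants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantScores:
    variant: str                # hypothetical variant label, e.g. "D137N"
    conservation_score: float   # assumed: more negative = less tolerated in evolution
    ddg: float                  # assumed: predicted stability change; positive = destabilising


def classify_variant(v, conservation_cutoff=-2.0, ddg_cutoff=2.0):
    """Toy rule: a variant that conservation disfavours but that is predicted
    to leave stability intact is flagged as a candidate functional-site variant.
    Both cutoffs are illustrative assumptions, not values from the paper."""
    if v.conservation_score > conservation_cutoff:
        return "tolerated"
    if v.ddg >= ddg_cutoff:
        return "stability-driven"
    return "candidate-functional"


# Made-up example scores, for illustration only.
examples = [
    VariantScores("A10G", -0.5, 0.3),
    VariantScores("L45P", -4.1, 3.8),
    VariantScores("D137N", -3.9, 0.2),
]
labels = {v.variant: classify_variant(v) for v in examples}
```

With these made-up inputs, only the variant that is disfavoured by the sequence score yet predicted to be stable is flagged as a candidate functional-site variant. The published method learns this distinction from multiplexed experimental data rather than using fixed cutoffs.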

Citing Articles

Rewiring protein sequence and structure generative models to enhance protein stability prediction.

Li Z, Luo Y. bioRxiv. 2025.

PMID: 40027759 PMC: 11870403. DOI: 10.1101/2025.02.13.638154.


Decoding biomolecular condensate dynamics: an energy landscape approach.

Biswas S, Potoyan D. PLoS Comput Biol. 2025; 21(2):e1012826.

PMID: 39928699 PMC: 11841893. DOI: 10.1371/journal.pcbi.1012826.


Deep Learning Approaches for the Prediction of Protein Functional Sites.

Pitarch B, Pazos F. Molecules. 2025; 30(2).

PMID: 39860084 PMC: 11767512. DOI: 10.3390/molecules30020214.


DPFunc: accurately predicting protein function via deep learning with domain-guided structure information.

Wang W, Shuai Y, Zeng M, Fan W, Li M. Nat Commun. 2025; 16(1):70.

PMID: 39746897 PMC: 11697396. DOI: 10.1038/s41467-024-54816-8.


Expert-guided protein language models enable accurate and blazingly fast fitness prediction.

Marquet C, Schlensok J, Abakarova M, Rost B, Laine E. Bioinformatics. 2024; 40(11).

PMID: 39576695 PMC: 11588025. DOI: 10.1093/bioinformatics/btae621.

