Deep Learning Approaches for the Prediction of Protein Functional Sites

Overview
Journal Molecules
Publisher MDPI
Date 2025 Jan 25
PMID 39860084
Abstract

Knowing which residues of a protein are important for its function is crucial for understanding the molecular basis of that function and for devising ways to modify it for medical or biotechnological applications. Because these residues are difficult to detect experimentally, prediction methods are essential to cope with the sequence deluge that is filling databases with uncharacterized protein sequences. Deep learning approaches are especially well suited to this task because of the large number of protein sequences available for training, the trivial encoding of sequence data as input to these systems, and the intrinsically sequential nature of the data, which makes it amenable to language models. As a consequence, deep learning-based approaches are being applied to the prediction of different types of functional sites and regions in proteins. This review gives an overview of the current landscape of methodologies so that interested users can see which kinds of approaches are available for their proteins of interest. We also outline how these systems work and explain their limitations and their strong dependence on the training set, so that users know what quality of results to expect.
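
As an illustration of how simple the numerical encoding of sequence data mentioned above can be, the following minimal sketch one-hot encodes a protein sequence in Python. It is not taken from the review: the function name one_hot_encode, the alphabet ordering, and the choice to map non-standard residues to all-zero rows are assumptions made for this example, and real predictors may instead use learned embeddings from protein language models.

```python
# Hypothetical illustration only: one-hot encoding of a protein sequence.
# The alphabet order and the handling of unknown residues are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-element row per residue.

    Residues outside the standard alphabet (e.g. 'X') become all-zero rows.
    """
    encoded = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        idx = INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        encoded.append(row)
    return encoded


matrix = one_hot_encode("MKTX")
print(len(matrix), len(matrix[0]))  # prints "4 20": 4 residues, 20 columns
```

Each residue becomes a fixed-length vector, so a sequence of length L becomes an L x 20 matrix that a neural network can take as input; a per-residue functional-site predictor would then output one label or score per row.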
