
A Review of Machine Learning and Algorithmic Methods for Protein Phosphorylation Site Prediction

Overview
Specialty Biology
Date 2023 Oct 20
PMID 37863385
Abstract

Post-translational modifications (PTMs) play key roles in extending the functional diversity of proteins and, as a result, in regulating diverse cellular processes in prokaryotic and eukaryotic organisms. Phosphorylation is a vital PTM that occurs in most proteins and plays a significant role in many biological processes. Disorders in the phosphorylation process lead to multiple diseases, including neurological disorders and cancers. The purpose of this review is to organize the body of knowledge on phosphorylation site (p-site) prediction to facilitate future research in this field. First, we comprehensively review the related databases and introduce the steps of dataset creation, data preprocessing, and method evaluation in p-site prediction. Next, we investigate p-site prediction methods, which fall into two computational groups: algorithmic and machine learning (ML). Within ML, there are two main approaches to p-site prediction, conventional methods and end-to-end deep learning methods, and we give an overview of both. This review also introduces the most important feature extraction techniques used in p-site prediction. Finally, we create three test sets, based on general and human species, from proteins newly added in the 2022 release of the database of protein post-translational modifications (dbPTM). Evaluating online p-site prediction tools on these newly added proteins, which are absent from the dbPTM 2019 release, reveals their limitations: the actual performance of these tools on unseen proteins is notably lower than the results reported in their respective research papers.
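The held-out evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes each database release is represented as a dictionary mapping a protein ID to a sequence and a set of annotated p-site positions (1-based), that candidate sites are all S, T, and Y residues, and that unannotated candidates count as negatives. The protein IDs, sequences, predictor output, and function names (`new_proteins`, `candidate_sites`, `evaluate`) are hypothetical.

```python
from math import sqrt

def new_proteins(old_release, new_release):
    """IDs present in the newer release but absent from the older one."""
    return sorted(set(new_release) - set(old_release))

def candidate_sites(sequence, residues="STY"):
    """1-based positions of residues that could be phosphorylated (assumed S/T/Y)."""
    return [i + 1 for i, aa in enumerate(sequence) if aa in residues]

def evaluate(y_true, y_pred):
    """Sensitivity, specificity, and Matthews correlation coefficient (MCC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Toy releases (hypothetical data): ID -> (sequence, annotated p-site positions).
release_2019 = {"P00001": ("MKSAT", {3})}
release_2022 = {"P00001": ("MKSAT", {3}), "P00002": ("MSTAYKS", {2, 5})}

# Hypothetical output of an online tool: ID -> predicted p-site positions.
predicted = {"P00002": {2, 3}}

y_true, y_pred = [], []
for pid in new_proteins(release_2019, release_2022):
    sequence, annotated = release_2022[pid]
    for pos in candidate_sites(sequence):
        y_true.append(1 if pos in annotated else 0)
        y_pred.append(1 if pos in predicted.get(pid, set()) else 0)

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Restricting the test set to proteins absent from the older release keeps proteins a tool may have been trained on out of the evaluation, which is why performance measured this way can fall below the figures reported in the original papers.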

Citing Articles

MMFuncPhos: A Multi-Modal Learning Framework for Identifying Functional Phosphorylation Sites and Their Regulatory Types.

Xie J, Dong R, Zhu J, Lin H, Wang S, Lai L. Adv Sci (Weinh). 2025; 12(9):e2410981.

PMID: 39804866 PMC: 11884596. DOI: 10.1002/advs.202410981.


Current computational tools for protein lysine acylation site prediction.

Qin Z, Ren H, Zhao P, Wang K, Liu H, Miao C. Brief Bioinform. 2024; 25(6).

PMID: 39316944 PMC: 11421846. DOI: 10.1093/bib/bbae469.


In Silico Analysis of the Missense Variants of Uncertain Significance of Gene Reported in GnomAD Database.

Caballero-Avendano A, Gutierrez-Angulo M, Ayala-Madrigal M, Moreno-Ortiz J, Gonzalez-Mercado A, Peregrina-Sandoval J. Genes (Basel). 2024; 15(8).

PMID: 39202333 PMC: 11353749. DOI: 10.3390/genes15080972.


Protein modification in neurodegenerative diseases.

Ramazi S, Dadzadi M, Darvazi M, Seddigh N, Allahverdi A. MedComm (2020). 2024; 5(8):e674.

PMID: 39105197 PMC: 11298556. DOI: 10.1002/mco2.674.


Interaction of Soybean ( (L.) ) Class II ACBPs with MPK2 and SAPK2 Kinases: New Insights into the Regulatory Mechanisms of Plant ACBPs.

Moradi A, Lung S, Chye M. Plants (Basel). 2024; 13(8).

PMID: 38674555 PMC: 11055065. DOI: 10.3390/plants13081146.

