
GPS-pPLM: A Language Model for Prediction of Prokaryotic Phosphorylation Sites

Overview
Journal: Cells
Publisher: MDPI
Date: 2024 Nov 27
PMID: 39594603
Abstract

In the prokaryotic kingdom, protein phosphorylation serves as one of the most important posttranslational modifications (PTMs) and is involved in orchestrating a broad spectrum of biological processes. Here, we report an updated online server named the group-based prediction system for prokaryotic phosphorylation language model (GPS-pPLM), used for predicting phosphorylation sites (p-sites) in prokaryotes. For model training, two deep learning methods, a transformer and a deep neural network, were employed, and a total of 10 sequence features and contextual features were integrated. Using 44,839 nonredundant p-sites in 16,041 proteins from 95 prokaryotes, two general models for the prediction of O-phosphorylation and N-phosphorylation were first pretrained and then fine-tuned to construct 6 predictors specific for each phosphorylatable residue type as well as 134 species-specific predictors. Compared with other existing tools, the GPS-pPLM exhibits higher accuracy in predicting prokaryotic O-phosphorylation p-sites. Users can submit protein sequences in FASTA format or UniProt accession numbers, and the predicted results are displayed in tabular form. In addition, we annotate the predicted p-sites with knowledge from 22 public resources, including experimental evidence, 3D structures, and disorder tendencies. The online service of the GPS-pPLM is freely accessible for academic research.
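As an illustration of the kind of input such a predictor consumes, the sketch below parses FASTA text and lists candidate residues together with their flanking sequence windows. It is not the GPS-pPLM implementation. The function names, the flank length of 7, the `*` padding character, and the choice of S, T, and Y as O-phosphorylation candidates are all assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Assumption: O-phosphorylation candidates are the hydroxyl residues S, T, Y.
O_PHOSPHO_RESIDUES = frozenset("STY")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into a {record_id: sequence} dictionary."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current = tokens[0] if tokens else "unnamed"
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def candidate_sites(
    sequence: str,
    residues: Iterable[str] = O_PHOSPHO_RESIDUES,
    flank: int = 7,   # hypothetical window half-width
    pad: str = "*",   # hypothetical padding symbol for sequence termini
) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, residue, window) for each candidate residue."""
    wanted = set(residues)
    padded = pad * flank + sequence + pad * flank
    for i, aa in enumerate(sequence):
        if aa in wanted:
            # In the padded string the residue sits at index i + flank,
            # so this slice is centred on it.
            yield i + 1, aa, padded[i : i + 2 * flank + 1]


if __name__ == "__main__":
    demo = ">demo hypothetical protein\nMKSAT\nYG\n"
    for name, seq in parse_fasta(demo).items():
        for pos, aa, window in candidate_sites(seq, flank=2):
            print(name, pos, aa, window)
```

Each window could then be encoded as features for a classifier. The encoding and the model itself are outside the scope of this sketch.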
