Combining Machine Learning with Structure-based Protein Design to Predict and Engineer Post-translational Modifications of Proteins

Overview
Specialty: Biology
Date: 2024 Mar 14
PMID: 38484014
Abstract

Post-translational modifications (PTMs) of proteins play a vital role in their function and stability. These modifications influence protein folding, signaling, protein-protein interactions, enzyme activity, binding affinity, aggregation, degradation, and much more. To date, over 400 types of PTMs have been described, representing chemical diversity well beyond the genetically encoded amino acids. Such modifications pose a challenge to the successful design of proteins, but also represent a major opportunity to diversify the protein engineering toolbox. To this end, we first trained artificial neural networks (ANNs) to predict eighteen of the most abundant PTMs, including protein glycosylation, phosphorylation, methylation, and deamidation. In a second step, these models were implemented inside the computational protein modeling suite Rosetta, which allows flexible combination with existing protocols to model the modified sites and understand their impact on protein stability as well as function. Lastly, we developed a new design protocol that either maximizes or minimizes the predicted probability of a particular site being modified. We find that this combination of ANN prediction and structure-based design can enable the modification of existing, as well as the introduction of novel, PTMs. The potential applications of our work include, but are not limited to, glycan masking of epitopes, strengthening protein-protein interactions through phosphorylation, as well as protecting proteins from deamidation liabilities. These applications are especially important for the design of new protein therapeutics where PTMs can drastically change the therapeutic properties of a protein. Our work adds novel tools to Rosetta's protein engineering toolbox that allow for the rational design of PTMs.
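The design step described in the abstract, which maximizes or minimizes the predicted probability that a site is modified, can be illustrated with a minimal sketch. This is not the Rosetta protocol or the authors' ANN. The scoring function `toy_sequon_score` is a hypothetical stand-in that applies only the well-known N-X-S/T sequon rule (X not proline) for N-linked glycosylation, with made-up probability values. The greedy search in `design_site`, its `window` and `max_rounds` parameters, and the choice to keep the target residue fixed are likewise assumptions made for illustration. The sketch also ignores structure and energetics, which the published protocol handles inside Rosetta.

```python
from typing import Callable, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_sequon_score(seq: str, site: int) -> float:
    """Hypothetical stand-in for a trained PTM predictor.

    Scores N-glycosylation at `site` using only the N-X-S/T sequon rule
    (X must not be proline). The returned values are illustrative, not
    fitted probabilities.
    """
    if site < 0 or site >= len(seq) or seq[site] != "N":
        return 0.0
    if site + 2 >= len(seq):
        return 0.05
    x, third = seq[site + 1], seq[site + 2]
    if x == "P":
        return 0.05
    if third == "T":
        return 0.9
    if third == "S":
        return 0.7
    return 0.05


def design_site(
    seq: str,
    site: int,
    predictor: Callable[[str, int], float],
    maximize: bool = True,
    window: int = 2,
    max_rounds: int = 5,
) -> Tuple[str, float]:
    """Greedy point-mutation search around `site` (an assumed strategy).

    Mutates residues within `window` positions of the target site, keeping
    the target residue itself fixed, and accepts any mutation that moves the
    predicted probability in the requested direction.
    """
    sign = 1.0 if maximize else -1.0
    best_seq, best_p = seq, predictor(seq, site)
    lo, hi = max(0, site - window), min(len(seq), site + window + 1)

    for _ in range(max_rounds):
        improved = False
        for pos in range(lo, hi):
            if pos == site:
                continue  # never mutate the modified residue itself
            for aa in AMINO_ACIDS:
                if aa == best_seq[pos]:
                    continue
                cand = best_seq[:pos] + aa + best_seq[pos + 1:]
                p = predictor(cand, site)
                if sign * p > sign * best_p:
                    best_seq, best_p, improved = cand, p, True
        if not improved:
            break  # local optimum reached
    return best_seq, best_p


if __name__ == "__main__":
    # Introduce a sequon at the asparagine (index 2) of a toy peptide.
    print(design_site("GANGAK", 2, toy_sequon_score, maximize=True))
    # Suppress an existing sequon without touching the asparagine.
    print(design_site("GANGTK", 2, toy_sequon_score, maximize=False))
```

Passing the predictor as a callable keeps the search independent of the model, so a trained network could be substituted for the toy rule without changing `design_site`. A realistic protocol would additionally filter candidates by their predicted effect on folding stability.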

Citing Articles

Self-supervised machine learning methods for protein design improve sampling but not the identification of high-fitness variants.

Ertelt M, Moretti R, Meiler J, Schoeder C Sci Adv. 2025; 11(7):eadr7338.

PMID: 39937901 PMC: 11817935. DOI: 10.1126/sciadv.adr7338.


Artificial Intelligence Transforming Post-Translational Modification Research.

Kim D, Yin T, Zhang T, Im A, Cort J, Rozum J Bioengineering (Basel). 2025; 12(1).

PMID: 39851300 PMC: 11762806. DOI: 10.3390/bioengineering12010026.


DLBWE-Cys: a deep-learning-based tool for identifying cysteine S-carboxyethylation sites using binary-weight encoding.

Luo Z, Wang Q, Xia Y, Zhu X, Yang S, Xu Z Front Genet. 2025; 15:1464976.

PMID: 39845187 PMC: 11751040. DOI: 10.3389/fgene.2024.1464976.


Integrative Multi-PTM Proteomics Reveals Dynamic Global, Redox, Phosphorylation, and Acetylation Regulation in Cytokine-Treated Pancreatic Beta Cells.

Gluth A, Li X, Gritsenko M, Gaffrey M, Kim D, Lalli P Mol Cell Proteomics. 2024; 23(12):100881.

PMID: 39550035 PMC: 11700301. DOI: 10.1016/j.mcpro.2024.100881.


Current computational tools for protein lysine acylation site prediction.

Qin Z, Ren H, Zhao P, Wang K, Liu H, Miao C Brief Bioinform. 2024; 25(6).

PMID: 39316944 PMC: 11421846. DOI: 10.1093/bib/bbae469.

