Computational Prediction of MoRFs Based on Protein Sequences and Minimax Probability Machine
Overview
Background: Molecular recognition features (MoRFs) are an important type of disordered segment that promotes specific protein-protein interactions. They are located within longer intrinsically disordered regions (IDRs) and undergo disorder-to-order transitions upon binding to their interaction partners. Because MoRFs are functionally important and experimental identification is limited, accurate computational prediction of MoRFs is needed.
Results: In this study, a new sequence-based method, named MoRFMPM, is proposed for predicting MoRFs. MoRFMPM uses a minimax probability machine (MPM) to predict MoRFs from 16 features and 3 different windows; it neither relies on other predictors nor calculates the properties of the regions surrounding MoRFs separately. Compared with ANCHOR, MoRFpred and MoRFCHiBi on the same test sets, MoRFMPM obtains not only a higher AUC but also a higher TPR at low FPR.
Conclusions: The features used in MoRFMPM can effectively predict MoRFs, especially after preprocessing. In addition, MoRFMPM uses a linear classification algorithm and does not rely on the results of other predictors, which makes it accessible and repeatable.
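The Results report the true positive rate (TPR) at a low false positive rate (FPR) alongside AUC. As a minimal sketch of how such a figure can be computed from per-residue scores, the function below sweeps the decision threshold from the highest score downward and keeps the best TPR seen while the FPR stays within a cap. The function name `tpr_at_fpr`, the toy labels and the toy scores are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation code.

```python
def tpr_at_fpr(labels, scores, max_fpr):
    """Return the highest TPR achievable while keeping FPR <= max_fpr.

    labels: 1 for a MoRF residue, 0 for a non-MoRF residue (hypothetical encoding).
    scores: predictor output per residue; higher means "more likely MoRF".
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative residue")

    # Rank residues from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        # Residues with tied scores fall on the same side of any threshold,
        # so they are counted together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / negatives > max_fpr:
            break  # lowering the threshold further only raises the FPR
        best_tpr = tp / positives
        i = j
    return best_tpr


# Toy data (made up for illustration): 3 MoRF residues, 10 non-MoRF residues.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.04, 0.03]
print(tpr_at_fpr(toy_labels, toy_scores, max_fpr=0.1))
```

Reporting TPR at a fixed low FPR complements AUC because MoRF residues are typically a small minority of a sequence, so the low-FPR end of the ROC curve is where a predictor is most likely to be used in practice.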