Predicting Human Protein Subcellular Localization by Heterogeneous and Comprehensive Approaches
Overview
Drug development and the investigation of protein function both require an understanding of protein subcellular localization. We developed a system, REALoc, that predicts the subcellular localization of singleplex and multiplex human proteins. The system follows a comprehensive strategy and consists of two heterogeneous systematic frameworks that integrate one-to-one and many-to-many machine learning methods. It uses sequence-based features (amino acid composition, surface accessibility, weighted sign aa index, and sequence similarity profile) together with gene ontology function-based features. REALoc predicts localization to six subcellular compartments: cell membrane, cytoplasm, endoplasmic reticulum/Golgi, mitochondrion, nucleus, and extracellular. It yielded a 75.3% absolute true success rate in five-fold cross-validation and a 57.1% absolute true success rate in an independent database test, more than 10% higher than the rates of six other prediction systems. Lastly, we analyzed the effects of the Vote and GANN models on the efficacy of singleplex and multiplex localization prediction. REALoc is freely available at http://predictor.nchu.edu.tw/REALoc.
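To make two of the terms above concrete, the following is a minimal Python sketch, not REALoc's implementation. It assumes that amino acid composition means the fraction of each of the 20 standard residues in a sequence, and that the absolute true success rate is the exact-match rate commonly used in multi-label localization work, where a multiplex protein counts as correct only if the predicted set of compartments equals the true set. The function names and the toy data are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted label set equals the true set.

    Assumed definition: partial overlap (over- or under-prediction)
    counts as a miss.
    """
    if len(true_sets) != len(predicted_sets):
        raise ValueError("true_sets and predicted_sets must have equal length")
    if not true_sets:
        return 0.0
    exact = sum(1 for t, p in zip(true_sets, predicted_sets) if set(t) == set(p))
    return exact / len(true_sets)


# Toy example (made-up labels, not data from the study).
true_labels = [
    {"nucleus"},
    {"cytoplasm", "nucleus"},   # multiplex protein
    {"mitochondrion"},
    {"cell membrane"},
]
predicted_labels = [
    {"nucleus"},
    {"nucleus"},                # under-predicted: counts as wrong
    {"mitochondrion"},
    {"extracellular"},
]
rate = absolute_true_rate(true_labels, predicted_labels)  # 2 of 4 exact matches
```

Under this definition the metric is strict for multiplex proteins: predicting only one of two true compartments scores the same as predicting none of them.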