
Going from Where to Why--interpretable Prediction of Protein Subcellular Localization

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2010 Mar 20
PMID: 20299325
Citations: 63
Abstract

Motivation: Protein subcellular localization is pivotal to understanding a protein's function. Computational prediction of subcellular localization has become a viable alternative to experimental approaches. While current machine learning-based methods yield good prediction accuracy, most suffer from two key problems: a lack of interpretability and difficulty handling proteins that occupy multiple locations.

Results: We present YLoc, a novel method for predicting protein subcellular localization that addresses these issues. Owing to its simple architecture, YLoc can identify the features of a protein sequence that contribute to its subcellular localization, e.g. localization signals or motifs relevant to protein sorting. We present several example applications in which YLoc identifies the sequence features responsible for protein localization, and thus reveals not only to which location a protein is transported, but also why it is transported there. YLoc also provides a confidence estimate for each prediction, so the user can decide what level of error is acceptable. Owing to its probabilistic approach and the use of several thousand dual-targeted proteins, YLoc is able to predict multiple locations per protein. YLoc was benchmarked on several independent datasets for protein subcellular localization and performs on par with other state-of-the-art predictors. When low-confidence predictions are disregarded, YLoc can achieve prediction accuracies of over 90%. Moreover, we show that YLoc reliably predicts multiple locations and outperforms the best predictors in this area.

Availability: www.multiloc.org/YLoc.
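
The Results paragraph describes two ideas that a small example may make more concrete: discarding low-confidence predictions to raise accuracy at the cost of coverage, and reporting more than one location per protein. The following is a minimal sketch under stated assumptions, not YLoc's actual model or API. The probability values, the function names (`predict`, `accuracy_and_coverage`, `locations_above`), and the simple cutoff rule for multiple locations are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical per-protein outputs: (location probabilities, true location).
# These numbers are invented for illustration; they are not YLoc results.
PREDICTIONS: List[Tuple[Dict[str, float], str]] = [
    ({"nucleus": 0.92, "cytoplasm": 0.06, "mitochondrion": 0.02}, "nucleus"),
    ({"nucleus": 0.48, "cytoplasm": 0.45, "mitochondrion": 0.07}, "cytoplasm"),
    ({"nucleus": 0.05, "cytoplasm": 0.10, "mitochondrion": 0.85}, "mitochondrion"),
    ({"nucleus": 0.40, "cytoplasm": 0.35, "mitochondrion": 0.25}, "mitochondrion"),
]


def predict(probs: Dict[str, float], min_confidence: float) -> Optional[str]:
    """Return the most probable location, or None if it is below the threshold."""
    location = max(probs, key=probs.get)
    return location if probs[location] >= min_confidence else None


def accuracy_and_coverage(
    predictions: List[Tuple[Dict[str, float], str]], min_confidence: float
) -> Tuple[float, float]:
    """Accuracy over the retained predictions, and the fraction retained."""
    kept = 0
    correct = 0
    for probs, truth in predictions:
        label = predict(probs, min_confidence)
        if label is None:
            continue  # low-confidence prediction is disregarded
        kept += 1
        correct += int(label == truth)
    accuracy = correct / kept if kept else 0.0
    coverage = kept / len(predictions)
    return accuracy, coverage


def locations_above(probs: Dict[str, float], cutoff: float) -> List[str]:
    """Assumed multi-location rule: report every location at or above the cutoff."""
    return sorted(loc for loc, p in probs.items() if p >= cutoff)


if __name__ == "__main__":
    for threshold in (0.0, 0.8):
        acc, cov = accuracy_and_coverage(PREDICTIONS, threshold)
        print(f"threshold={threshold:.1f} accuracy={acc:.2f} coverage={cov:.2f}")
    print(locations_above(PREDICTIONS[1][0], 0.3))
```

On this toy data, raising the threshold keeps only the two confident predictions, so accuracy rises while coverage falls. This is the trade-off the abstract refers to when it says the user can decide what level of error is acceptable.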

Citing Articles

SCLpred-ECL: Subcellular Localization Prediction by Deep N-to-1 Convolutional Neural Networks.
Gillani M, Pollastri G. Int J Mol Sci. 2024; 25(10).
PMID: 38791479 PMC: 11121631. DOI: 10.3390/ijms25105440.

Protein subcellular localization prediction tools.
Gillani M, Pollastri G. Comput Struct Biotechnol J. 2024; 23:1796-1807.
PMID: 38707539 PMC: 11066471. DOI: 10.1016/j.csbj.2024.04.032.

A mutation in CsGME encoding GDP-mannose 3,5-epimerase results in little and wrinkled leaf in cucumber.
Liu M, Li Z, Kang Y, Lv J, Jin Z, Mu S. Theor Appl Genet. 2024; 137(5):114.
PMID: 38678513 DOI: 10.1007/s00122-024-04600-5.

gene of unknown function Gohir.A02G161000 encodes a potential transmembrane Root UVB Sensitive 4 Protein with a putative protein-protein interaction interface.
Graffam D, Cutlan M, Storm A, Hulse-Kemp A, Stoeckman A. MicroPubl Biol. 2024; 2024.
PMID: 38495582 PMC: 10943365. DOI: 10.17912/micropub.biology.000869.

Regulation of developmental gatekeeping and cell fate transition by the calpain protease DEK1 in Physcomitrium patens.
Demko V, Belova T, Messerer M, Hvidsten T, Perroud P, Ako A. Commun Biol. 2024; 7(1):261.
PMID: 38438476 PMC: 10912778. DOI: 10.1038/s42003-024-05933-z.

