
PScL-HDeep: Image-based Prediction of Protein Subcellular Location in Human Tissue Using Ensemble Learning of Handcrafted and Deep Learned Features with Two-layer Feature Selection

Overview
Journal: Brief Bioinform
Specialty: Biology
Date: 2021 Aug 2
PMID: 34337652
Citations: 15
Abstract

Protein subcellular localization plays a crucial role in characterizing protein function and understanding cellular processes, so accurate identification of protein subcellular location is an important yet challenging task. Numerous computational methods have been proposed to predict the subcellular location of proteins, but most are limited in overall accuracy, running time and generalization power. To address these problems, we developed PScL-HDeep, a computational approach based on Human Protein Atlas (HPA) data for accurate and efficient image-based prediction of protein subcellular location in human tissues. We extracted handcrafted features and deep learned features (the latter from a pretrained deep learning model) that describe the image from different viewpoints. The stepwise discriminant analysis (SDA) algorithm was applied to each raw feature set to generate an optimal feature subset. To obtain a more informative subset, support vector machine-based recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) was then applied to the integrated feature set. Finally, two classification models, a support vector machine with a radial basis function kernel (SVM-RBF) and a support vector machine with a linear kernel (SVM-LNR), were trained on the final selected feature set. To evaluate the proposed method, a new gold standard benchmark training dataset was constructed from the HPA databank. On this dataset, PScL-HDeep achieved its best performance in a 10-fold cross-validation test and showed better efficacy than existing predictors. A stringent independent validation test further illustrated the generalization ability of the proposed method.
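
The last stages of the pipeline described in the abstract (feature selection on the integrated feature set, then SVM-RBF and SVM-LNR classifiers evaluated by 10-fold cross-validation) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. It rests on several assumptions: the feature matrix is synthetic rather than derived from HPA images, plain SVM-RFE stands in for SVM-RFE + CBR (scikit-learn has no correlation bias reduction step), the SDA stage is omitted, and combining the two SVMs by soft voting is a guess at the ensemble rule. The function name `build_model` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_model(n_selected=20):
    """Scale features, select a subset with SVM-RFE, classify with two SVMs."""
    # Plain SVM-RFE: a linear SVM ranks features and the weakest are dropped
    # five at a time. Assumption: this stands in for SVM-RFE + CBR.
    selector = RFE(SVC(kernel="linear"), n_features_to_select=n_selected, step=5)
    # Assumption: the two SVMs are combined by averaging class probabilities.
    ensemble = VotingClassifier(
        estimators=[
            ("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
            ("svm_lnr", SVC(kernel="linear", probability=True, random_state=0)),
        ],
        voting="soft",
    )
    return make_pipeline(StandardScaler(), selector, ensemble)


# Synthetic stand-in for an integrated handcrafted + deep feature matrix.
X, y = make_classification(
    n_samples=300,
    n_features=60,
    n_informative=12,
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=1.5,
    random_state=0,
)

# 10-fold stratified cross-validation; selection is refit inside every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(build_model(), X, y, cv=cv)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

Placing the selector inside the pipeline means feature selection is repeated on the training portion of each fold, so the held-out fold does not influence which features are kept.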

Citing Articles

TargetCLP: clathrin proteins prediction combining transformed and evolutionary scale modeling-based multi-view features via weighted feature integration approach.

Ullah M, Akbar S, Raza A, Khan K, Zou Q Brief Bioinform. 2025; 26(1).

PMID: 39844339 PMC: 11753890. DOI: 10.1093/bib/bbaf026.


A Review for Artificial Intelligence Based Protein Subcellular Localization.

Xiao H, Zou Y, Wang J, Wan S Biomolecules. 2024; 14(4).

PMID: 38672426 PMC: 11048326. DOI: 10.3390/biom14040409.


Leveraging a meta-learning approach to advance the accuracy of Na blocking peptides prediction.

Shoombuatong W, Homdee N, Schaduangrat N, Chumnanpuen P Sci Rep. 2024; 14(1):4463.

PMID: 38396246 PMC: 10891130. DOI: 10.1038/s41598-024-55160-z.


Dual-Signal Feature Spaces Map Protein Subcellular Locations Based on Immunohistochemistry Image and Protein Sequence.

Zou K, Wang S, Wang Z, Zou H, Yang F Sensors (Basel). 2023; 23(22).

PMID: 38005402 PMC: 10675401. DOI: 10.3390/s23229014.


Empirical comparison and analysis of machine learning-based approaches for druggable protein identification.

Shoombuatong W, Schaduangrat N, Nikom J EXCLI J. 2023; 22:915-927.

PMID: 37780939 PMC: 10539545. DOI: 10.17179/excli2023-6410.

