Bioimaging-based Detection of Mislocalized Proteins in Human Cancers by Semi-supervised Learning
Motivation: There is long-standing interest in the challenging task of finding translocated and mislocalized cancer biomarker proteins. Bioimages of subcellular protein distribution are a new data source that has attracted much attention in recent years because they describe protein distribution intuitively and in detail. However, automated methods for large-scale biomarker screening suffer significantly from the lack of subcellular location annotations for bioimages from cancer tissues. The transfer prediction idea of applying models trained on normal tissue proteins to predict the subcellular locations of cancerous ones is hard to justify, because protein distribution patterns may differ between normal and cancerous states.
Results: We developed a new semi-supervised protocol that uses unlabeled cancer protein data in model construction through an iterative and incremental training strategy. Our approach enables us to selectively use low-quality images from normal states to expand the training sample space, and it provides a general way of handling a small set of annotated images used together with a large set of unannotated ones. Experiments demonstrate that the new semi-supervised protocol improves the accuracy and sensitivity of subcellular location difference detection.
Availability And Implementation: The data and code are available at: www.csbio.sjtu.edu.cn/bioinf/SemiBiomarker/.
Supplementary Information: Supplementary data are available at Bioinformatics online.
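The iterative and incremental training strategy named in the Results can be illustrated with a generic self-training loop. The sketch below is not the authors' protocol: it assumes a simple nearest-centroid classifier, a distance-based confidence score, and a fixed acceptance threshold, and all function and parameter names (self_train, threshold, max_rounds) are hypothetical. It only shows the general idea of growing a small labeled set with confidently predicted unlabeled samples.

```python
import math


def centroids(X, y):
    """Return the mean feature vector of each class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(cents, x):
    """Return (label, confidence) for one sample.

    Confidence is a softmax over negative Euclidean distances to the
    class centroids (an assumed scoring rule, chosen for simplicity).
    """
    labels = sorted(cents)
    dists = [math.dist(cents[c], x) for c in labels]
    d_min = min(dists)
    # Shift by the smallest distance so the exponentials cannot all underflow.
    weights = [math.exp(-(d - d_min)) for d in dists]
    best = dists.index(d_min)
    return labels[best], weights[best] / sum(weights)


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Iteratively add confidently predicted unlabeled samples to the training set.

    Returns the final centroids and the samples that were never accepted.
    """
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X_train, y_train)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                # Accept the pseudo-label and grow the training set.
                X_train.append(x)
                y_train.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break  # nothing new was learned, or nothing is left to label
    return centroids(X_train, y_train), pool
```

In this sketch, samples that never reach the threshold stay unlabeled rather than being forced into a class, so ambiguous images do not contaminate the training set. A real pipeline would replace the toy classifier with image features and a stronger model.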