iT3SE-PX: Identification of Bacterial Type III Secreted Effectors Using PSSM Profiles and XGBoost Feature Selection

Overview
Publisher: Hindawi
Date: 2021 Jan 28
PMID: 33505516
Citations: 5
Abstract

Identification of bacterial type III secreted effectors (T3SEs) has become a popular research topic in bioinformatics because these proteins are crucial for understanding host-pathogen interactions and for developing better therapeutic targets against pathogens. However, identifying all effector proteins with traditional experimental approaches is often time-consuming and laborious. Computational methods that accurately predict putative novel effectors are therefore important for reducing the number of biological experiments needed for validation. In this study, we propose a method, called iT3SE-PX, that identifies T3SEs based solely on protein sequences. First, three kinds of features are extracted from position-specific scoring matrix (PSSM) profiles to train a machine learning (ML) model. Then, the extreme gradient boosting (XGBoost) algorithm is used to rank these features by their classification ability. Finally, the optimal features are selected as inputs to a support vector machine (SVM) classifier that predicts T3SEs. Using two benchmark datasets, we conducted a 5-fold cross-validation (CV) repeated 100 times with random partitioning and an independent test, respectively. The experimental results demonstrate that the proposed method outperforms most existing methods and could serve as a useful tool for identifying putative T3SEs given only sequence information.
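
The evaluation protocol named in the abstract, 5-fold CV repeated 100 times with random partitioning, can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the function names `kfold_indices` and `repeated_cv`, the toy labels, and the majority-class scorer standing in for the trained SVM are all assumptions made for the example, and the abstract does not say whether the original folds were stratified (they are not here).

```python
import random
from statistics import mean


def kfold_indices(n, k, rng):
    """Yield (train, test) index lists for one randomly shuffled k-fold split."""
    idx = list(range(n))
    rng.shuffle(idx)
    # Every k-th shuffled index goes to the same fold, so folds are disjoint.
    folds = [idx[f::k] for f in range(k)]
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test


def repeated_cv(n, evaluate, k=5, repeats=100, seed=0):
    """Average evaluate(train, test) over `repeats` independent k-fold runs."""
    rng = random.Random(seed)
    run_means = []
    for _ in range(repeats):
        fold_scores = [evaluate(tr, te) for tr, te in kfold_indices(n, k, rng)]
        run_means.append(mean(fold_scores))
    return mean(run_means)


# Toy labels (hypothetical): 1 = effector, 0 = non-effector.
labels = [1] * 20 + [0] * 40


def majority_baseline(train, test):
    """Placeholder scorer: accuracy of always predicting the training majority.

    In a real pipeline this callback would fit the classifier on `train`
    and score it on `test`.
    """
    majority = 1 if 2 * sum(labels[i] for i in train) > len(train) else 0
    return mean(1.0 if labels[i] == majority else 0.0 for i in test)


score = repeated_cv(len(labels), majority_baseline, k=5, repeats=100, seed=0)
print(f"mean accuracy over 100 x 5-fold CV: {score:.3f}")
```

Passing the scorer as a callback keeps the splitting logic independent of the model, so the same loop could wrap feature ranking, feature selection, and classifier training inside each fold, which avoids leaking test-fold information into the selection step.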

Citing Articles

Predicting the Response of Patients Treated with Lu-DOTATATE Using Single-photon Emission Computed Tomography-Computed Tomography Image-based Radiomics and Clinical Features.

Behmanesh B, Abdi-Saray A, Deevband M, Amoui M, Haghighatkhah H, Shalbaf A. J Med Signals Sens. 2024; 14:28.

PMID: 39600984 PMC: 11592923. DOI: 10.4103/jmss.jmss_54_23.


Natural language processing approach to model the secretion signal of type III effectors.

Wagner N, Alburquerque M, Ecker N, Dotan E, Zerah B, Pena M. Front Plant Sci. 2022; 13:1024405.

PMID: 36388586 PMC: 9659976. DOI: 10.3389/fpls.2022.1024405.


PreAcrs: a machine learning framework for identifying anti-CRISPR proteins.

Zhu L, Wang X, Li F, Song J. BMC Bioinformatics. 2022; 23(1):444.

PMID: 36284264 PMC: 9597991. DOI: 10.1186/s12859-022-04986-3.


DeepT3 2.0: improving type III secreted effector predictions by an integrative deep learning framework.

Jing R, Wen T, Liao C, Xue L, Liu F, Yu L. NAR Genom Bioinform. 2021; 3(4):lqab086.

PMID: 34617013 PMC: 8489581. DOI: 10.1093/nargab/lqab086.


Accurate Identification of Antioxidant Proteins Based on a Combination of Machine Learning Techniques and Hidden Markov Model Profiles.

Shen Z, Liu T, Xu T. Comput Math Methods Med. 2021; 2021:5770981.

PMID: 34413898 PMC: 8369162. DOI: 10.1155/2021/5770981.
