
IBPred: A Sequence-based Predictor for Identifying Ion Binding Protein in Phage

Overview
Specialty Biotechnology
Date 2022 Sep 23
PMID 36147670
Abstract

Ion binding proteins (IBPs) interact selectively and non-covalently with ions. In phages, IBPs also play an important role in biological processes, so identifying them accurately is necessary for understanding their biological functions and the molecular mechanisms that involve ion binding. Because experimental molecular biology methods for identifying IBPs remain labor-intensive and costly, computational methods that identify IBPs quickly and efficiently are helpful. In this work, a random forest (RF)-based model was constructed to identify IBPs. Features were extracted from protein sequence information and the physicochemical properties of residues by combining the dipeptide composition with the physicochemical correlation between two residues. A feature selection technique, analysis of variance (ANOVA), was used to exclude redundant information. Comparison with other classification methods showed that our method identifies IBPs accurately. Based on the model, a Python package named IBPred was built; its source code is available at https://github.com/ShishiYuan/IBPred.
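The two feature steps named in the abstract can be illustrated with a minimal standard-library Python sketch. It is not taken from the IBPred source code: the function names `dipeptide_composition` and `anova_f` are hypothetical, the 20-letter alphabet and the handling of non-standard residues are assumptions, and the physicochemical correlation features and the random forest classifier are left out. The first function computes the 400 dipeptide frequencies of a sequence; the second computes the one-way ANOVA F statistic of a single feature across classes, the kind of score by which features could be ranked during selection.

```python
from itertools import product

# Assumption: the 20 standard amino acids, giving 20 * 20 = 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the frequency of each of the 400 dipeptides in a sequence."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def anova_f(groups):
    """One-way ANOVA F statistic for one feature.

    `groups` holds one non-empty list of feature values per class,
    e.g. [values_in_IBPs, values_in_non_IBPs].
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the samples around their own class mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Under these assumptions, a higher F value means the feature separates the classes better relative to its spread within each class, so ranking the 400 dipeptide features by F and keeping the top-scoring ones is one way to discard redundant features before training a classifier.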

Citing Articles

TCellPredX: A Novel Approach for Accurate Prediction of Hepatitis C Virus Linear T Cell Epitopes.

Ge F, Li H, Zhang M, Arif M, Alam T ACS Omega. 2025; 9(52):51494-51507.

PMID: 39758636 PMC: 11696426. DOI: 10.1021/acsomega.4c08715.


ac4C-AFL: A high-precision identification of human mRNA N4-acetylcytidine sites based on adaptive feature representation learning.

Pham N, Terrance A, Jeon Y, Rakkiyappan R, Manavalan B Mol Ther Nucleic Acids. 2024; 35(2):102192.

PMID: 38779332 PMC: 11108997. DOI: 10.1016/j.omtn.2024.102192.


Accurately identifying hemagglutinin using sequence information and machine learning methods.

Zou X, Ren L, Cai P, Zhang Y, Ding H, Deng K Front Med (Lausanne). 2023; 10:1281880.

PMID: 38020152 PMC: 10644030. DOI: 10.3389/fmed.2023.1281880.


A First Computational Frame for Recognizing Heparin-Binding Protein.

Zhu W, Yuan S, Li J, Huang C, Lin H, Liao B Diagnostics (Basel). 2023; 13(14).

PMID: 37510209 PMC: 10377868. DOI: 10.3390/diagnostics13142465.


Antimicrobial Peptides Prediction method based on sequence multidimensional feature embedding.

Dong B, Li M, Jiang B, Gao B, Li D, Zhang T Front Genet. 2022; 13:1069558.

PMID: 36468005 PMC: 9714691. DOI: 10.3389/fgene.2022.1069558.
