
iFeature: a Python Package and Web Server for Features Extraction and Selection from Protein and Peptide Sequences

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2018 Mar 13
PMID: 29528364
Citations: 199
Abstract

Summary: Structural and physiochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection and dimensionality reduction algorithms, greatly facilitating training, analysis and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit.

Availability And Implementation: http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/.

Supplementary Information: Supplementary data are available at Bioinformatics online.
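
As an illustration of the kind of numerical sequence encoding the Summary describes, the following is a minimal sketch of two widely used descriptors, amino acid composition and dipeptide composition. It is not iFeature's own code or API: the function names `amino_acid_composition` and `dipeptide_composition`, the alphabetical residue ordering, and the handling of non-standard residues are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in an (assumed) alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return 20 values: the fraction of each standard residue in the sequence.

    Non-standard characters (e.g. 'X' or '-') are ignored; this is a
    simplification chosen for the sketch.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return 400 values: the fraction of each ordered pair of adjacent residues.

    Pairs that contain a non-standard character are skipped (an assumption
    of this sketch).
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    peptide = "ACCA"  # hypothetical toy peptide
    print(amino_acid_composition(peptide)[:2])  # fractions of A and C
    print(len(dipeptide_composition(peptide)))  # number of dipeptide features
```

Each sequence maps to a fixed-length vector regardless of its own length, which is what allows such descriptors to be passed to standard clustering, feature selection and machine-learning routines.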

Citing Articles

Predicting amyloid proteins using attention-based long short-term memory.

Li Z. PeerJ Comput Sci. 2025; 11:e2660.

PMID: 40062260 PMC: 11888867. DOI: 10.7717/peerj-cs.2660.


iAMP-CRA: Identifying Antimicrobial Peptides Using Convolutional Recurrent Neural Network with Self-Attention.

Lu J, He Y, Han G, Zeng L. Health Inf Sci Syst. 2025; 13(1):25.

PMID: 40062190 PMC: 11883064. DOI: 10.1007/s13755-025-00342-w.


PyPropel: a Python-based tool for efficiently processing and characterising protein data.

Sun J, Ru J, Cribbs A, Xiong D. BMC Bioinformatics. 2025; 26(1):70.

PMID: 40025421 PMC: 11871610. DOI: 10.1186/s12859-025-06079-3.


An optimized deep-forest algorithm using a modified differential evolution optimization algorithm: A case of host-pathogen protein-protein interaction prediction.

Emmanuel J, Isewon I, Oyelade J. Comput Struct Biotechnol J. 2025; 27:595-611.

PMID: 39995682 PMC: 11849198. DOI: 10.1016/j.csbj.2025.01.020.


APBIO: bioactive profiling of air pollutants through inferred bioactivity signatures and prediction of novel target interactions.

Viesi E, Perricone U, Aloy P, Giugno R. J Cheminform. 2025; 17(1):13.

PMID: 39891207 PMC: 11786462. DOI: 10.1186/s13321-025-00961-1.

