DEEPre: Sequence-based Enzyme EC Number Prediction by Deep Learning

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2017 Oct 26
PMID: 29069344
Citations: 80
Abstract

Motivation: Annotation of enzyme function has a broad range of applications, such as metagenomics, industrial biotechnology, and diagnosis of diseases caused by enzyme deficiency. However, the time and resources required make it prohibitively expensive to determine the function of every enzyme experimentally. Computational enzyme function prediction has therefore become increasingly important. In this paper, we develop such an approach, determining enzyme function by predicting the Enzyme Commission (EC) number.
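As background for the prediction target: an EC number is a four-level hierarchical label (main class, subclass, sub-subclass, serial number), so a prediction can be scored at any depth. The following is a minimal sketch of that label structure, not code from the paper. The names `ECNumber`, `parse_ec`, and `hierarchy_labels` are hypothetical, and preliminary serial numbers such as `n1` are assumed to be out of scope.

```python
from typing import List, NamedTuple, Optional

# Top-level EC classes. Class 7 was added to the nomenclature in 2018,
# after this paper appeared, so the paper itself deals with classes 1-6.
MAIN_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


class ECNumber(NamedTuple):
    """Hypothetical container for the four EC levels; None means unassigned."""
    main_class: int
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec(ec: str) -> ECNumber:
    """Parse strings such as 'EC 3.4.21.4' or '3.4.-.-' into an ECNumber."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {ec!r}")
    # A dash marks a level that has not been assigned.
    levels = [None if part == "-" else int(part) for part in parts]
    if levels[0] is None:
        raise ValueError("the main class must be specified")
    return ECNumber(*levels)


def hierarchy_labels(ec: ECNumber) -> List[str]:
    """Return one label per assigned level, from coarsest to finest."""
    labels: List[str] = []
    prefix: List[str] = []
    for level in ec:
        if level is None:
            break
        prefix.append(str(level))
        labels.append(".".join(prefix))
    return labels
```

For example, `hierarchy_labels(parse_ec("EC 3.4.21.4"))` yields one label per level, and `MAIN_CLASSES[3]` gives the name of the main class. A level-by-level classifier would be evaluated against these successive prefixes.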

Results: We propose DEEPre, an end-to-end approach to feature selection and classification model training for enzyme function prediction, together with an automatic and robust method for making feature dimensionality uniform. Instead of extracting manually crafted features from enzyme sequences, our model takes the raw sequence encoding as input and extracts convolutional and sequential features from it, guided by the classification result, to directly improve prediction performance. Thorough cross-fold validation experiments on two large-scale datasets show that DEEPre outperforms the previous state-of-the-art methods. In addition, our server outperforms five other servers in determining the main class of enzymes on a separate low-homology dataset. Two case studies demonstrate DEEPre's ability to capture the functional difference between enzyme isoforms.
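To make "raw sequence encoding" and "convolutional features" concrete, here is a small illustrative sketch in plain Python. It is not DEEPre's architecture, which the abstract does not specify. It assumes a one-hot encoding over the 20 standard amino acids and uses hand-written filters with global max pooling, a common generic way to turn variable-length sequences into fixed-length feature vectors. The names `one_hot` and `conv_max_pool` are hypothetical.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

Matrix = List[List[float]]  # rows = sequence positions, columns = residues


def one_hot(sequence: str) -> Matrix:
    """Encode each residue as a 20-dimensional indicator row.

    Residues outside the standard alphabet (e.g. 'X') become all-zero rows.
    """
    rows: Matrix = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1.0
        rows.append(row)
    return rows


def conv_max_pool(encoded: Matrix, filters: List[Matrix]) -> List[float]:
    """Slide each filter along the sequence and keep its maximum response.

    Every filter is a (width x 20) weight matrix. The output has one value
    per filter, so its length does not depend on the sequence length.
    """
    features: List[float] = []
    for kernel in filters:
        width = len(kernel)
        if len(encoded) < width:
            raise ValueError("sequence is shorter than the filter width")
        responses = (
            sum(
                kernel[k][c] * encoded[pos + k][c]
                for k in range(width)
                for c in range(len(AMINO_ACIDS))
            )
            for pos in range(len(encoded) - width + 1)
        )
        features.append(max(responses))
    return features


def motif_filter(motif: str) -> Matrix:
    """Build a filter that responds most strongly to an exact motif match."""
    return one_hot(motif)
```

With `filters = [motif_filter("AC"), motif_filter("GW")]`, both `conv_max_pool(one_hot("GACG"), filters)` and `conv_max_pool(one_hot("MKGWACLL"), filters)` return two-element vectors despite the different sequence lengths. In a trained model the filter weights would be learned from the classification loss rather than written by hand.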

Availability and Implementation: The server is freely accessible at http://www.cbrc.kaust.edu.sa/DEEPre.

Contact: xin.gao@kaust.edu.sa.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Learning maximally spanning representations improves protein function annotation.

Luo J, Luo Y bioRxiv. 2025.

PMID: 40027840 PMC: 11870436. DOI: 10.1101/2025.02.13.638156.


Comparative Assessment of Protein Large Language Models for Enzyme Commission Number Prediction.

Capela J, Zimmermann-Kogadeeva M, van Dijk A, de Ridder D, Dias O, Rocha M BMC Bioinformatics. 2025; 26(1):68.

PMID: 40016653 PMC: 11866580. DOI: 10.1186/s12859-025-06081-9.


AtSubP-2.0: An integrated web server for the annotation of Arabidopsis proteome subcellular localization using deep learning.

Duhan N, Kaundal R Plant Genome. 2025; 18(1):e20536.

PMID: 39924294 PMC: 11807733. DOI: 10.1002/tpg2.20536.


SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes.

Khanduja A, Mohanty D NAR Genom Bioinform. 2025; 7(1):lqae186.

PMID: 39781515 PMC: 11704790. DOI: 10.1093/nargab/lqae186.


CLAIRE: a contrastive learning-based predictor for EC number of chemical reactions.

Zeng Z, Guo J, Jin J, Luo X J Cheminform. 2025; 17(1):2.

PMID: 39773344 PMC: 11707929. DOI: 10.1186/s13321-024-00944-8.

