Structure-based Function Prediction of Uncharacterized Protein Using Binding Sites Comparison

Overview
Specialty Biology
Date 2013 Nov 19
PMID 24244144
Citations 16
Abstract

A challenge in structural genomics is predicting the function of uncharacterized proteins. When a protein cannot be related to other proteins of known activity, identification of function based on sequence or structural homology is impossible, and in such cases it would be useful to assess structurally conserved binding sites in connection with the protein's function. In this paper, we propose the function of a protein of unknown activity, the Tm1631 protein from Thermotoga maritima, by comparing its predicted binding site to a library containing thousands of candidate structures. The comparison revealed numerous similarities with nucleotide binding sites, including, specifically, a DNA-binding site of endonuclease IV. We constructed a model of the Tm1631 protein with a DNA ligand from the newly found similar binding site using ProBiS, and validated this model by molecular dynamics. The interactions predicted by the Tm1631-DNA model corresponded to those known to be important in the endonuclease IV-DNA complex model, and the binding free energies calculated from the two models were in close agreement. We thus propose that Tm1631 is a DNA-binding enzyme with endonuclease activity that recognizes DNA lesions in which at least two consecutive nucleotides are unpaired. Our approach is general and can be applied to any protein of unknown function. It might also be useful for guiding experimental determination of the function of uncharacterized proteins.
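
The library-comparison step described in the abstract can be illustrated with a toy ranking. This is only a sketch under loud assumptions: it does not reproduce ProBiS or its structural alignment, and it swaps in a plain weighted Jaccard similarity over made-up physicochemical feature labels. The function names (`site_similarity`, `rank_library`), the feature labels, and the library entries are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def site_similarity(site_a, site_b):
    """Weighted Jaccard similarity between two multisets of feature labels.

    Assumption: a binding site is reduced to a bag of labels. Real tools
    compare 3D surface structure; this toy score ignores geometry entirely.
    """
    a, b = Counter(site_a), Counter(site_b)
    shared = sum((a & b).values())  # multiset intersection size
    total = sum((a | b).values())   # multiset union size
    return shared / total if total else 0.0


def rank_library(query, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, site_similarity(query, feats)) for name, feats in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical query site and library entries, for illustration only.
query_site = ["donor", "acceptor", "aromatic", "positive", "positive"]
library = {
    "dna_binding_site": ["donor", "acceptor", "positive", "positive", "aromatic", "aliphatic"],
    "sugar_binding_site": ["donor", "donor", "acceptor", "aliphatic"],
    "metal_site": ["negative", "negative", "acceptor"],
}

ranking = rank_library(query_site, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the top-ranked library entry plays the role of the "newly found similar binding site"; in the workflow the abstract describes, such a hit would then be examined further by modeling and molecular dynamics rather than accepted on the score alone.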

Citing Articles

The protein interactome of Escherichia coli carbohydrate metabolism.

Chowdhury S, Fong S, Uetz P PLoS One. 2025; 20(2):e0315240.

PMID: 39903745 PMC: 11793828. DOI: 10.1371/journal.pone.0315240.


Learning a generalized graph transformer for protein function prediction in dissimilar sequences.

Fu Y, Gu Z, Luo X, Guo Q, Lai L, Deng M Gigascience. 2024; 13.

PMID: 39657158 PMC: 11734293. DOI: 10.1093/gigascience/giae093.


EGPDI: identifying protein-DNA binding sites based on multi-view graph embedding fusion.

Zheng M, Sun G, Li X, Fan Y Brief Bioinform. 2024; 25(4).

PMID: 38975896 PMC: 11229037. DOI: 10.1093/bib/bbae330.


EquiPNAS: improved protein-nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks.

Roche R, Moussad B, Shuvo M, Tarafder S, Bhattacharya D Nucleic Acids Res. 2024; 52(5):e27.

PMID: 38281252 PMC: 10954458. DOI: 10.1093/nar/gkae039.


Gene Ontology Capsule GAN: an improved architecture for protein function prediction.

Mansoor M, Nauman M, Ur Rehman H, Omar M PeerJ Comput Sci. 2022; 8:e1014.

PMID: 36092003 PMC: 9454774. DOI: 10.7717/peerj-cs.1014.

