
Predicting DNA-binding Proteins and Binding Residues by Complex Structure Prediction and Application to Human Proteome

Overview
Journal PLoS One
Date 2014 May 6
PMID 24792350
Citations 9
Abstract

As increasingly inexpensive sequencing techniques uncover more and more protein sequences, determining their functions has become an urgent task. This work presents a highly reliable computational technique that predicts DNA-binding function at the level of protein-DNA complex structures, rather than the low-resolution two-state (binding or non-binding) prediction that most existing techniques provide. The method first predicts the protein-DNA complex structure using the template-based structure prediction technique HHblits, then predicts binding affinity with a knowledge-based energy function (a distance-scaled finite ideal-gas reference state for protein-DNA interactions). Leave-one-out cross-validation of the method on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides reasonably accurate predictions of DNA-binding residues based on the predicted DNA-binding complex structures. Applying it to the human proteome yields more than 300 novel predicted DNA-binding proteins; some of these predicted structures were validated against known structures of homologous proteins in apo forms. The method, SPOT-Seq (DNA), is available as an online server at http://sparks-lab.org.
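To make the relationship between the reported precision, sensitivity, and MCC concrete, the following is a minimal sketch that computes all three from a confusion matrix. The counts are hypothetical: they are not given in the abstract and were chosen only so that they sum to the stated 179 binding and 3797 non-binding domains and roughly reproduce the reported percentages. The function name `classification_metrics` is likewise an illustrative assumption.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Return (precision, sensitivity, MCC) for a binary confusion matrix."""
    precision = tp / (tp + fp)        # fraction of predicted binders that are true binders
    sensitivity = tp / (tp + fn)      # fraction of true binders that are recovered
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, sensitivity, mcc


# HYPOTHETICAL counts (assumption, not reported in the abstract):
# 179 binding domains     -> 116 true positives + 63 false negatives
# 3797 non-binding domains -> 7 false positives + 3790 true negatives
TP, FN = 116, 63
FP, TN = 7, 3790

precision, sensitivity, mcc = classification_metrics(TP, FP, TN, FN)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} MCC={mcc:.2f}")
```

Under these assumed counts the three values round to 0.94, 0.65, and 0.77, which is consistent with the figures quoted in the abstract. Because non-binding domains outnumber binding ones by roughly 21 to 1, MCC is a more informative summary than plain accuracy, which would be high even for a predictor that labels every domain as non-binding.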

Citing Articles

Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences.

Basu S, Yu J, Kihara D, Kurgan L. Brief Bioinform. 2025; 26(1).

PMID: 39833102 PMC: 11745544. DOI: 10.1093/bib/bbaf016.


A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.

Jia P, Zhang F, Wu C, Li M. Brief Bioinform. 2024; 25(3).

PMID: 38739759 PMC: 11089422. DOI: 10.1093/bib/bbae162.


Deep-WET: a deep learning-based approach for predicting DNA-binding proteins using word embedding techniques with weighted features.

Mahmud S, Goh K, Faruk Hosen M, Nandi D, Shoombuatong W. Sci Rep. 2024; 14(1):2961.

PMID: 38316843 PMC: 10844231. DOI: 10.1038/s41598-024-52653-9.


DBP-iDWT: Improving DNA-Binding Proteins Prediction Using Multi-Perspective Evolutionary Profile and Discrete Wavelet Transform.

Ali F, Barukab O, Gadicha A, Patil S, Alghushairy O, Sarhan A. Comput Intell Neurosci. 2022; 2022:2987407.

PMID: 36211019 PMC: 9534628. DOI: 10.1155/2022/2987407.


DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.

Ali F, Ahmed S, Swati Z, Akbar S. J Comput Aided Mol Des. 2019; 33(7):645-658.

PMID: 31123959 DOI: 10.1007/s10822-019-00207-x.

