
Dissecting Protein Loops with a Statistical Scalpel Suggests a Functional Implication of Some Structural Motifs

Overview
Publisher Biomed Central
Specialty Biology
Date 2011 Jun 22
PMID 21689388
Citations 5
Abstract

Background: One strategy for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function.

Results: Here, we present a systematic extraction of seven-residue structural motifs from protein loops and explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which simplifies protein structures into one-dimensional sequences, combined with advanced pattern statistics adapted to short sequences. Motifs of interest are selected as those significantly over-represented in the loops of SCOP superfamilies. We discovered two types of such motifs: (i) ubiquitous motifs, shared by several superfamilies, and (ii) superfamily-specific motifs, over-represented in a few superfamilies. A comparison of ubiquitous motifs with known small structural motifs shows that they include well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and the biological annotations of Swiss-Prot reveals that some of them correspond to functional sites involved in the binding of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM.

Conclusions: Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
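
The selection step described in the abstract (seven-letter words over-represented in the loops of a superfamily) can be illustrated with a minimal sketch. It is not the authors' method: the abstract refers to advanced pattern statistics adapted to short sequences, whereas this sketch substitutes a plain hypergeometric tail test with no multiple-testing correction. The function names, the input layout (a dictionary mapping a superfamily label to its loop strings) and the threshold `alpha` are assumptions made for illustration, and the loops are assumed to have already been encoded as structural-alphabet strings upstream.

```python
from collections import Counter
from math import comb


def count_words(loops, k=7):
    """Count, for each k-letter word, the number of loops containing it at least once."""
    counts = Counter()
    for seq in loops:
        # A set ensures each word is counted once per loop.
        words = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(words)
    return counts


def hypergeom_tail(x, total_loops, loops_with_word, sample_size):
    """P(X >= x) when sample_size loops are drawn without replacement from
    total_loops loops, loops_with_word of which contain the word."""
    upper = min(loops_with_word, sample_size)
    numerator = sum(
        comb(loops_with_word, i) * comb(total_loops - loops_with_word, sample_size - i)
        for i in range(x, upper + 1)
    )
    return numerator / comb(total_loops, sample_size)


def over_represented(loops_by_superfamily, k=7, alpha=0.05):
    """Return (superfamily, word, count, p_value) tuples with p_value <= alpha.

    Stand-in statistic (assumption): hypergeometric tail against the pooled
    loop set, without correction for multiple testing.
    """
    all_loops = [s for loops in loops_by_superfamily.values() for s in loops]
    pooled = count_words(all_loops, k)
    hits = []
    for superfamily, loops in loops_by_superfamily.items():
        local = count_words(loops, k)
        for word, x in local.items():
            p = hypergeom_tail(x, len(all_loops), pooled[word], len(loops))
            if p <= alpha:
                hits.append((superfamily, word, x, p))
    return sorted(hits, key=lambda h: h[3])
```

Calling `over_represented` on such a dictionary returns candidate words sorted by p-value. Under this sketch, a word flagged in only a few superfamilies would play the role of a superfamily-specific motif, and a word flagged in several would play the role of a ubiquitous one.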

Citing Articles

ANN based prediction of ligand binding sites outside deep cavities to facilitate drug designing.

Singh K, Singh Malik Y. Curr Res Struct Biol. 2024; 7:100144.

PMID: 38681239 PMC: 11047793. DOI: 10.1016/j.crstbi.2024.100144.


Analysis of the HIV-2 protease's adaptation to various ligands: characterization of backbone asymmetry using a structural alphabet.

Triki D, Cano Contreras M, Flatters D, Visseaux B, Descamps D, Camproux A. Sci Rep. 2018; 8(1):710.

PMID: 29335428 PMC: 5768731. DOI: 10.1038/s41598-017-18941-3.


Exploring the potential of a structural alphabet-based tool for mining multiple target conformations and target flexibility insight.

Regad L, Cheron J, Triki D, Senac C, Flatters D, Camproux A. PLoS One. 2017; 12(8):e0182972.

PMID: 28817602 PMC: 5560695. DOI: 10.1371/journal.pone.0182972.


Detecting protein candidate fragments using a structural alphabet profile comparison approach.

Shen Y, Picord G, Guyon F, Tuffery P. PLoS One. 2013; 8(11):e80493.

PMID: 24303019 PMC: 3841190. DOI: 10.1371/journal.pone.0080493.


SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.

Regad L, Saladin A, Maupetit J, Geneix C, Camproux A. Nucleic Acids Res. 2011; 39(Web Server issue):W203-9.

PMID: 21665924 PMC: 3125790. DOI: 10.1093/nar/gkr410.
