Parameterization and Classification of the Protein Universe Via Geometric Techniques
We present a scheme for classifying 3487 non-redundant protein structures into 1207 non-hierarchical clusters, using recurring structural patterns of three to six amino acids as classification keys. The classification yields several signature patterns that appear to determine a protein's membership in a functional category. These patterns provide clues to the key residues involved in functional sites and in protein-protein interactions. The discovered patterns include a "glutamate double bridge" in superoxide dismutase, the functional interface between serine proteases and their inhibitors, the interfaces of homo- and heterodimers, and the functional sites of several enzyme families. We use geometric invariants to decide whether two structural patterns are superimposable. This allows patterns to be parameterized and recurring patterns to be discovered by clustering. Because each pattern is described by its own invariants, the approach eliminates the computationally explosive step of pairwise structure comparison. The results provide a large resource for biologists seeking to validate the proposed functional sites experimentally, and for the design of synthetic enzymes, inhibitors and drugs.
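The idea of replacing pairwise superposition with geometric invariants can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state which invariants are used, so the sketch assumes the simplest choice, the inter-point distances of a pattern taken in a fixed residue order and rounded to a bin width. The function names, the 0.5 bin width and the toy coordinates are all hypothetical. Distances are unchanged by rotation and translation (and also by mirror reflection, which a real scheme would need to handle separately), so superimposable patterns receive the same key and can be grouped in a single pass.

```python
import math
from collections import defaultdict
from itertools import combinations


def invariant_key(coords, bin_width=0.5):
    """Return quantized inter-point distances in a fixed point order.

    Assumption for illustration: plain distances serve as the invariants.
    """
    return tuple(
        round(math.dist(p, q) / bin_width) for p, q in combinations(coords, 2)
    )


def group_by_invariant(patterns, bin_width=0.5):
    """Group pattern names by key; no pattern is compared with another."""
    groups = defaultdict(list)
    for name, coords in patterns.items():
        groups[invariant_key(coords, bin_width)].append(name)
    return dict(groups)


def rotate_z_and_shift(coords, angle, shift):
    """Rigidly move a pattern: rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Toy three-point patterns (hypothetical coordinates, not real residues).
triangle = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (1.0, 5.0, 0.0)]
patterns = {
    "pattern_a": triangle,
    "pattern_a_moved": rotate_z_and_shift(triangle, 1.1, (10.0, -4.0, 2.5)),
    "pattern_b": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (1.0, 9.0, 0.0)],
}
groups = group_by_invariant(patterns)
for key, names in groups.items():
    print(key, names)
```

Each pattern is processed once, so the grouping cost grows linearly with the number of patterns rather than quadratically. Hard binning can split nearly identical patterns whose distances fall on either side of a bin edge, which is one reason a clustering step in the invariant space, as the abstract describes, is preferable to exact key matching.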