Protein Fold Similarity Estimated by a Probabilistic Approach Based on C(alpha)-C(alpha) Distance Comparison
Molecular Biology
The distribution of C(alpha)-C(alpha) distances between residues separated by 3 to 30 amino acid residues is highly characteristic of protein folds, so folds can be identified by a straightforward comparison of their distance histograms. The comparison is carried out by contingency table analysis and yields a probability of identity (the PRIDE score), with values between 0 and 1. For closely related structures, PRIDE is highly correlated with the root-mean-square distance between C(alpha) atoms, but it also classifies correctly unrelated structures for which a structural alignment is not meaningful. For example, an analysis of the CATH database of fold structures showed that 98.8% of the folds fall into the correct CATH homologous superfamily category when assigned by the highest PRIDE score obtained. Calculating PRIDE requires neither structural alignment nor secondary-structure assignment, and it is fast enough to allow the scanning of large databases.
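The histogram comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 1 Å bin width, the plain averaging of the per-separation probabilities, and the names `chi2_sf`, `distance_histogram`, `pride_like_score`, `ideal_helix` and `extended_strand` are all assumptions made here for illustration. For each sequence separation from 3 to 30, the sketch bins the C(alpha)-C(alpha) distances of two chains, runs a chi-square test on the resulting 2 x k contingency table, and averages the resulting probabilities.

```python
import math
from collections import Counter


def chi2_sf(chi2, dof):
    """Upper-tail chi-square probability via the regularized incomplete gamma."""
    a, x = dof / 2.0, chi2 / 2.0
    if x <= 0.0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion for the lower tail P(a, x); return 1 - P.
        term = total = 1.0 / a
        denom = a
        for _ in range(1000):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * 1e-14:
                break
        return max(0.0, 1.0 - total * math.exp(log_prefactor))
    # Continued fraction (modified Lentz) for the upper tail Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return math.exp(log_prefactor) * h


def distance_histogram(coords, sep, bin_width):
    """Count C-alpha distances between residues i and i + sep, per distance bin."""
    return Counter(
        int(math.dist(coords[i], coords[i + sep]) // bin_width)
        for i in range(len(coords) - sep)
    )


def pride_like_score(coords_a, coords_b, min_sep=3, max_sep=30, bin_width=1.0):
    """Mean chi-square probability over separations (bin width is an assumption)."""
    probs = []
    for sep in range(min_sep, max_sep + 1):
        ha = distance_histogram(coords_a, sep, bin_width)
        hb = distance_histogram(coords_b, sep, bin_width)
        na, nb = sum(ha.values()), sum(hb.values())
        if na == 0 or nb == 0:
            continue  # a chain is too short for this separation
        bins = set(ha) | set(hb)
        chi2 = 0.0
        for k in bins:
            column_total = ha[k] + hb[k]
            for observed, row_total in ((ha[k], na), (hb[k], nb)):
                expected = row_total * column_total / (na + nb)
                chi2 += (observed - expected) ** 2 / expected
        dof = len(bins) - 1
        # With a single occupied bin the two histograms cannot be told apart.
        probs.append(chi2_sf(chi2, dof) if dof > 0 else 1.0)
    if not probs:
        raise ValueError("chains are too short for the requested separations")
    return sum(probs) / len(probs)


def ideal_helix(n):
    """Hypothetical test chain: idealized alpha-helix C-alpha trace."""
    return [(2.3 * math.cos(math.radians(100 * i)),
             2.3 * math.sin(math.radians(100 * i)),
             1.5 * i) for i in range(n)]


def extended_strand(n):
    """Hypothetical test chain: fully extended, straight C-alpha trace."""
    return [(0.0, 0.0, 3.3 * i) for i in range(n)]
```

Calling `pride_like_score(ideal_helix(40), ideal_helix(40))` compares a chain with itself and so gives the maximum score, whereas comparing the helix with `extended_strand(40)` gives a score close to 0. Real use would read C(alpha) coordinates from structure files, which the sketch leaves out.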