Strategies for High-throughput Comparative Modeling: Applications to Leverage Analysis in Structural Genomics and Protein Family Organization

Overview
Journal Proteins
Date 2006 Dec 13
PMID 17154423
Citations 13
Abstract

The technological breakthroughs in structural genomics were designed to facilitate the solution of a sufficient number of structures, so that as many protein sequences as possible can be structurally characterized with the aid of comparative modeling. The leverage of a solved structure is the number and quality of the models that can be produced using the structure as a template for modeling and may be viewed as the "currency" with which the success of a structural genomics endeavor can be measured. Moreover, the models obtained in this way should be valuable to all biologists. To this end, at the Northeast Structural Genomics Consortium (NESG), a modular computational pipeline for automated high-throughput leverage analysis was devised and used to assess the leverage of the 186 unique NESG structures solved during the first phase of the Protein Structure Initiative (January 2000 to July 2005). Here, the results of this analysis are presented. The number of sequences in the nonredundant protein sequence database covered by quality models produced by the pipeline is approximately 39,000, so that the average leverage is approximately 210 models per structure. Interestingly, only 7,900 of these models fulfill the stringent modeling criterion of being at least 30% sequence-identical to the corresponding NESG structures. This study shows how high-throughput modeling increases the efficiency of structure determination efforts by providing enhanced coverage of protein structure space. In addition, the approach is useful in refining the boundaries of structural domains within larger protein sequences, subclassifying sequence-diverse protein families, and defining structure-based strategies specific to a particular family.
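
The leverage figures quoted in the abstract follow from simple arithmetic, which the short sketch below reproduces. It is only an illustration: the three input numbers are the approximate values stated in the abstract, and the function names `average_leverage` and `fraction` are hypothetical helpers introduced here, not part of the NESG pipeline.

```python
# Approximate figures as quoted in the abstract.
NESG_STRUCTURES = 186          # unique NESG structures, PSI phase 1
QUALITY_MODELS = 39_000        # sequences covered by quality models (approx.)
HIGH_IDENTITY_MODELS = 7_900   # models at least 30% sequence-identical (approx.)


def average_leverage(models: int, structures: int) -> float:
    """Mean number of quality models per solved structure (hypothetical helper)."""
    if structures <= 0:
        raise ValueError("structures must be positive")
    return models / structures


def fraction(part: int, whole: int) -> float:
    """Share of `whole` accounted for by `part` (hypothetical helper)."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


if __name__ == "__main__":
    leverage = average_leverage(QUALITY_MODELS, NESG_STRUCTURES)
    strict_share = fraction(HIGH_IDENTITY_MODELS, QUALITY_MODELS)
    print(f"Average leverage: about {leverage:.0f} models per structure")
    print(f"Models meeting the 30% identity criterion: about {strict_share:.0%}")
```

Under these inputs the average comes out at roughly 210 models per structure, matching the abstract, and only about one fifth of the quality models meet the 30% sequence-identity criterion.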

Citing Articles

Structure-Based Approaches for Protein-Protein Interaction Prediction Using Machine Learning and Deep Learning.

Kiouri D, Batsis G, Chasapis C. Biomolecules. 2025; 15(1).

PMID: 39858535 PMC: 11763140. DOI: 10.3390/biom15010141.


Predicting peptide-mediated interactions on a genome-wide scale.

Chen T, Petrey D, Garzon J, Honig B. PLoS Comput Biol. 2015; 11(5):e1004248.

PMID: 25938916 PMC: 4418708. DOI: 10.1371/journal.pcbi.1004248.


PrePPI: a structure-informed database of protein-protein interactions.

Zhang Q, Petrey D, Garzon J, Deng L, Honig B. Nucleic Acids Res. 2012; 41(Database issue):D828-33.

PMID: 23193263 PMC: 3531098. DOI: 10.1093/nar/gks1231.


Structure-based prediction of protein-protein interactions on a genome-wide scale.

Zhang Q, Petrey D, Deng L, Qiang L, Shi Y, Aye Thu C. Nature. 2012; 490(7421):556-60.

PMID: 23023127 PMC: 3482288. DOI: 10.1038/nature11503.


The Protein Structure Initiative: achievements and visions for the future.

Montelione G. F1000 Biol Rep. 2012; 4:7.

PMID: 22500193 PMC: 3318194. DOI: 10.3410/B4-7.