High-throughput Computational Structure-based Characterization of Protein Families: START Domains and Implications for Structural Genomics

Overview

Journal: J Struct Funct Genomics

Specialty: Genetics

Date: 2010 Apr 13

PMID: 20383749

Citations: 3

Authors

Hunjoong Lee

Zhaohui Li

Antonina Silkov

Markus Fischer

Donald Petrey

Barry Honig

Diana Murray

Abstract

SkyLine, a high-throughput homology modeling pipeline, detects and models true sequence homologs of a given protein structure. Structures and models are stored in SkyBase with links to computational function annotation, as calculated by MarkUs. The SkyLine/SkyBase/MarkUs technology represents a novel structure-based approach that is more objective and versatile than other protein classification resources. This structure-centric strategy provides a multi-dimensional organization and coverage of protein space at the levels of family, function, and genome. The concept of "modelability", the ability to model sequences on related structures, provides a reliable criterion for membership in a protein family ("leverage") and underlies the unique success of this approach. The overall procedure is illustrated by its application to START domains, which comprise a Biomedical Theme for the Northeast Structural Genomics Consortium as part of the Protein Structure Initiative. START domains are typically involved in the non-vesicular transport of lipids. While 19 experimentally determined structures are available, the family, whose evolutionary hierarchy is not well determined, is highly diverse in sequence, and the ligand-binding potential of many family members is unknown. The SkyLine/SkyBase/MarkUs approach provides significant insights and predicts: (1) many more family members (approximately 4,000) than any other resource; (2) the function of a large number of unannotated proteins; (3) instances of START domains in genomes from which they were thought to be absent; and (4) the existence of two types of novel proteins, those containing dual START domains and those containing N-terminal START domains.
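
The "modelability" criterion described in the abstract can be illustrated with a small sketch. This is not the SkyLine implementation: the record fields, the identifiers, and both thresholds (`min_coverage`, `min_quality`) are hypothetical and chosen only for illustration. The sketch assumes that a sequence counts as a family member when at least one model built on a family structure passes the thresholds, and that the leverage of a structure is the number of distinct sequences it can model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """One homology model: a sequence built on one template structure."""
    sequence_id: str
    template_id: str
    coverage: float  # fraction of the template domain aligned, 0..1 (hypothetical field)
    quality: float   # model quality score, higher is better (hypothetical field)


def is_modelable(record, min_coverage=0.8, min_quality=0.5):
    """Return True if the model passes both illustrative thresholds."""
    return record.coverage >= min_coverage and record.quality >= min_quality


def family_members(records, **thresholds):
    """Sequences with at least one acceptable model on any family structure."""
    return {r.sequence_id for r in records if is_modelable(r, **thresholds)}


def leverage(records, **thresholds):
    """Number of distinct modelable sequences per template structure."""
    per_template = defaultdict(set)
    for r in records:
        if is_modelable(r, **thresholds):
            per_template[r.template_id].add(r.sequence_id)
    return {template: len(seqs) for template, seqs in per_template.items()}


# Made-up example data; identifiers do not refer to real proteins or structures.
records = [
    ModelRecord("seqA", "tmpl1", coverage=0.95, quality=0.8),
    ModelRecord("seqA", "tmpl2", coverage=0.90, quality=0.6),
    ModelRecord("seqB", "tmpl1", coverage=0.85, quality=0.7),
    ModelRecord("seqC", "tmpl2", coverage=0.40, quality=0.9),  # fails coverage
    ModelRecord("seqD", "tmpl1", coverage=0.90, quality=0.2),  # fails quality
]

if __name__ == "__main__":
    print(sorted(family_members(records)))
    print(leverage(records))
```

With these made-up records, `seqA` and `seqB` qualify as family members, while `seqC` and `seqD` are excluded because no model of either passes both thresholds.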

Citing Articles

The Lipid Transfer Protein StarD7: Structure, Function, and Regulation.

Flores-Martin J, Rena V, Angeletti S, Panzetta-Dutari G, Genti-Raimondi S. Int J Mol Sci. 2013; 14(3):6170-86.

PMID: 23507753 PMC: 3634439. DOI: 10.3390/ijms14036170.

Using structure to explore the sequence alignment space of remote homologs.

Kuziemko A, Honig B, Petrey D. PLoS Comput Biol. 2011; 7(10):e1002175.

PMID: 21998567 PMC: 3188491. DOI: 10.1371/journal.pcbi.1002175.

Genome-wide structural analysis reveals novel membrane binding properties of AP180 N-terminal homology (ANTH) domains.

Silkov A, Yoon Y, Lee H, Gokhale N, Adu-Gyamfi E, Stahelin R. J Biol Chem. 2011; 286(39):34155-63.

PMID: 21828048 PMC: 3190782. DOI: 10.1074/jbc.M111.265611.
