EVEREST: a Collection of Evolutionary Conserved Protein Domains

Overview
Specialty Biochemistry
Date 2006 Nov 14
PMID 17099230
Citations 7
Abstract

Protein domains are subunits of proteins that recur throughout the protein world. There are many definitions attempting to capture the essence of a protein domain, and several systems that identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatics, 7, 277, is one such system that performs the task automatically, using protein sequence alone. Herein we describe EVEREST release 2.0, consisting of 20,029 families, each defined by one or more HMMs. The current EVEREST database was constructed by scanning UniProt 8.1 and all PDB sequences (over 3,000,000 sequences in total) with each of the EVEREST families. EVEREST annotates 64% of all sequences and covers 59% of all residues. EVEREST is available at http://www.everest.cs.huji.ac.il/. The website provides annotations given by SCOP, CATH, Pfam A and EVEREST. It allows browsing through the families of each of those sources and graphically visualizing the domain organization of the proteins in each family. The website also provides access to analyses of relationships between domain families, within and across domain definition systems. Users can upload sequences for analysis by the set of EVEREST families. Finally, an advanced search form allows querying for families that match criteria regarding novelty, phylogenetic composition and more.
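
The two coverage figures quoted in the abstract measure different things: the fraction of sequences with at least one domain hit, and the fraction of residues that fall inside some hit. The following is a minimal sketch of how such figures could be computed, not the EVEREST pipeline itself. The input layout (a mapping from sequence ID to length, and a mapping from sequence ID to 1-based inclusive domain intervals) and the names `merge_intervals` and `coverage` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a domain hit is a 1-based inclusive (start, end) pair.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no residue is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage(
    lengths: Dict[str, int],
    annotations: Dict[str, List[Interval]],
) -> Tuple[float, float]:
    """Return (fraction of sequences annotated, fraction of residues covered)."""
    total_residues = sum(lengths.values())
    if not lengths or total_residues == 0:
        raise ValueError("need at least one non-empty sequence")

    annotated_sequences = 0
    covered_residues = 0
    for seq_id, length in lengths.items():
        # Clip each hit to the sequence bounds and drop hits that fall outside it.
        clipped = [
            (max(1, start), min(length, end))
            for start, end in annotations.get(seq_id, [])
        ]
        clipped = [(s, e) for s, e in clipped if s <= e]
        if clipped:
            annotated_sequences += 1
        covered_residues += sum(e - s + 1 for s, e in merge_intervals(clipped))

    return annotated_sequences / len(lengths), covered_residues / total_residues


# Toy data, invented for illustration only (not taken from EVEREST).
toy_lengths = {"A": 100, "B": 50, "C": 50}
toy_annotations = {"A": [(1, 40), (30, 60)], "B": [(11, 20)]}
seq_fraction, residue_fraction = coverage(toy_lengths, toy_annotations)
```

Overlapping hits are merged before counting, so a residue matched by several families contributes only once to residue coverage. This is one reasonable convention; the source does not say how EVEREST handles overlaps.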

Citing Articles

ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly.

Wang Y, Wang J, Li R, Shi Q, Xue Z, Zhang Y Nucleic Acids Res. 2017; 45(W1):W400-W407.

PMID: 28498994 PMC: 5793814. DOI: 10.1093/nar/gkx410.


Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains.

Xue Z, Jang R, Govindarajoo B, Huang Y, Wang Y PLoS One. 2015; 10(10):e0141541.

PMID: 26502173 PMC: 4621036. DOI: 10.1371/journal.pone.0141541.


The language of the protein universe.

Scaiewicz A, Levitt M Curr Opin Genet Dev. 2015; 35:50-6.

PMID: 26451980 PMC: 4695241. DOI: 10.1016/j.gde.2015.08.010.


ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree.

Rappoport N, Karsenty S, Stern A, Linial N, Linial M Nucleic Acids Res. 2011; 40(Database issue):D313-20.

PMID: 22121228 PMC: 3245180. DOI: 10.1093/nar/gkr1027.


More than 1,001 problems with protein domain databases: transmembrane regions, signal peptides and the issue of sequence homology.

Wong W, Maurer-Stroh S, Eisenhaber F PLoS Comput Biol. 2010; 6(7):e1000867.

PMID: 20686689 PMC: 2912341. DOI: 10.1371/journal.pcbi.1000867.

