
Sequence Signatures and the Probabilistic Identification of Proteins in the Myc-Max-Mad Network

Overview
Specialty Science
Date 2005 Apr 27
PMID 15851686
Citations 15
Abstract

Accurate identification of specific groups of proteins by their amino acid sequence is an important goal in genome research. Here we combine information theory with fuzzy logic search procedures to identify sequence signatures or predictive motifs for members of the Myc-Max-Mad transcription factor network. Myc is a well-known oncoprotein, and this family is involved in cell proliferation, apoptosis, and differentiation. We describe a small set of amino acid sites from the N-terminal portion of the basic helix-loop-helix (bHLH) domain that provide very accurate sequence signatures for the Myc-Max-Mad transcription factor network and three of its member proteins. A predictive motif involving 28 contiguous bHLH sequence elements found 337 network proteins in the GenBank NR database with no mismatches or misidentifications. This motif also identifies at least one previously unknown fungal protein with strong affinity to the Myc-Max-Mad network. Another motif found 96% of known Myc protein sequences with only a single mismatch, including sequences from genomes not previously thought to contain Myc proteins. The predictive motif for Myc is very similar to the ancestral sequence for the Myc group estimated from phylogenetic analyses. Based on available crystal structure studies, this motif is discussed in terms of its functional consequences. Our results provide insight into the evolutionary diversification of DNA binding and dimerization in a well-characterized family of regulatory proteins and provide a method of identifying signature motifs in protein families.
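The general idea of an entropy-derived sequence signature can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give their algorithm, so the version below assumes a simple per-column Shannon entropy cutoff to pick conserved sites and uses a plain mismatch count as a stand-in for the fuzzy logic search. All function names, the 0.5-bit threshold, and the five-residue toy alignment are hypothetical and are not real Myc, Max, or Mad sequences.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_entropy_sites(alignment, max_entropy):
    """Indices of alignment columns whose entropy is at or below max_entropy."""
    columns = zip(*alignment)
    return [i for i, col in enumerate(columns)
            if column_entropy(col) <= max_entropy]


def build_motif(alignment, sites):
    """Map each chosen site to the set of residues observed there."""
    return {i: {seq[i] for seq in alignment} for i in sites}


def count_mismatches(sequence, motif):
    """Number of motif sites where the sequence residue is not allowed."""
    return sum(1 for i, allowed in motif.items() if sequence[i] not in allowed)


def matches(sequence, motif, max_mismatches=0):
    """True if the sequence violates at most max_mismatches motif sites."""
    return count_mismatches(sequence, motif) <= max_mismatches


# Hypothetical toy alignment: equal-length, already aligned, no gaps.
toy_alignment = ["KRRTH", "KRKTH", "KRRSH", "KRRTH"]

# Assumed cutoff of 0.5 bits; only fully conserved columns pass here.
toy_sites = low_entropy_sites(toy_alignment, max_entropy=0.5)
toy_motif = build_motif(toy_alignment, toy_sites)

# A candidate is accepted when it violates no more sites than allowed.
exact_hit = matches("KRAAH", toy_motif)
near_hit = matches("ARAAH", toy_motif, max_mismatches=1)
```

Allowing one mismatch loosely mirrors the abstract's report of a motif that recovers most Myc sequences "with only a single mismatch", while a zero-mismatch search corresponds to the stricter 28-site network motif.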

Citing Articles

Identification and characterization of tyrosine kinases in anole lizard indicate the conserved tyrosine kinase repertoire in vertebrates.

Liu A, He F, Gu X. Mol Genet Genomics. 2017; 292(6):1405-1418.

PMID: 28819830 DOI: 10.1007/s00438-017-1356-7.


Venom Insulins of Cone Snails Diversify Rapidly and Track Prey Taxa.

Safavi-Hemami H, Lu A, Li Q, Fedosov A, Biggs J, Corneli P. Mol Biol Evol. 2016; 33(11):2924-2934.

PMID: 27524826 PMC: 5062327. DOI: 10.1093/molbev/msw174.


Genome-wide identification and analysis of basic helix-loop-helix domains in dog, Canis lupus familiaris.

Wang X, Wang Y, Liu A, Liu X, Zhou Y, Yao Q. Mol Genet Genomics. 2014; 290(2):633-48.

PMID: 25403511 DOI: 10.1007/s00438-014-0950-1.


Classification and evolutionary analysis of the basic helix-loop-helix gene family in the green anole lizard, Anolis carolinensis.

Liu A, Wang Y, Zhang D, Wang X, Song H, Dang C. Mol Genet Genomics. 2013; 288(7-8):365-80.

PMID: 23756994 DOI: 10.1007/s00438-013-0755-7.


Accurate discrimination of bHLH domains in plants, animals, and fungi using biologically meaningful sites.

Sailsbery J, Dean R. BMC Evol Biol. 2012; 12:154.

PMID: 22920570 PMC: 3502508. DOI: 10.1186/1471-2148-12-154.

