Expansion of Tandem Repeats in Sea Anemone Nematostella vectensis Proteome: A Source for Gene Novelty?

Overview
Journal BMC Genomics
Publisher BioMed Central
Specialty Genetics
Date 2009 Dec 17
PMID 20003297
Citations 3
Abstract

Background: The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes.

Results: We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150-amino-acid segments that are composed of basic units of 5-20 amino acids. In total, the N. vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with the H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique amino acid composition, with cysteine and histidine over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes from Nematostella and Hydra.
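To make the notion of a TR-segment built from basic units concrete, here is a minimal Python sketch that scans a protein sequence for exact, back-to-back copies of a unit 5-20 amino acids long. It is an illustration only, not the pipeline used in the study: the function name, the `TandemRepeat` record, the `min_copies` threshold, and the example sequence are all assumptions introduced here, and real repeat detectors also tolerate substitutions and gaps between copies, which this sketch does not.

```python
from typing import List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int   # 0-based index where the repeated segment begins
    unit: str    # the basic repeating unit
    copies: int  # number of consecutive exact copies of the unit


def find_exact_tandem_repeats(seq: str, min_unit: int = 5, max_unit: int = 20,
                              min_copies: int = 2) -> List[TandemRepeat]:
    """Naive scan for exact tandem repeats (hypothetical helper).

    At each position, every unit length in [min_unit, max_unit] is tried and
    the candidate covering the most residues is kept; ties go to the shorter
    unit because shorter lengths are tried first.
    """
    hits: List[TandemRepeat] = []
    i, n = 0, len(seq)
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many times the unit repeats immediately after itself.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            covered = copies * k
            if copies >= min_copies and (
                best is None or covered > best.copies * len(best.unit)
            ):
                best = TandemRepeat(i, unit, copies)
        if best is not None:
            hits.append(best)
            i += best.copies * len(best.unit)  # skip past the repeat segment
        else:
            i += 1
    return hits


# Made-up example: a cysteine/histidine-containing 6-residue unit repeated 4 times.
example = "MK" + "GCHPSC" * 4 + "LLA"
repeats = find_exact_tandem_repeats(example)
```

In this toy input, the scan reports a single segment starting at index 2, with unit `GCHPSC` and four copies. Restricting the search to exact copies keeps the logic short, at the cost of missing the diverged repeats that are common in real proteomes.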

Conclusions: While most TR-proteins are derived from prediction tools and still await experimental validation, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for generating novel genes with previously overlooked structural and functional characteristics.

Citing Articles

A haplotype resolved chromosomal level avocado genome allows analysis of novel avocado genes.

Nath O, Fletcher S, Hayward A, Shaw L, Masouleh A, Furtado A Hortic Res. 2022; 9:uhac157.

PMID: 36204209 PMC: 9531333. DOI: 10.1093/hr/uhac157.


Short toxin-like proteins abound in Cnidaria genomes.

Tirosh Y, Linial I, Askenazi M, Linial M Toxins (Basel). 2012; 4(11):1367-84.

PMID: 23202321 PMC: 3509713. DOI: 10.3390/toxins4111367.


Genetic diversity of the allodeterminant alr2 in Hydractinia symbiolongicarpus.

Rosengarten R, Moreno M, Lakkis F, Buss L, Dellaporta S Mol Biol Evol. 2010; 28(2):933-47.

PMID: 20966116 PMC: 3108555. DOI: 10.1093/molbev/msq282.
