
Clusters of Ancestrally Related Genes That Show Paralogy in Whole or in Part Are a Major Feature of the Genomes of Humans and Other Species

Overview
Journal PLoS One
Date 2012 May 8
PMID 22563380
Citations 3
Abstract

Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences, which give rise to clusters of genes sharing both sequence similarity and common sequence features, and the migration together of genes related by function but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here describe the extent to which ancestrally related genes are found in proximity. To this end, we combined information from five genomic datasets: InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e., form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein-coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species- or clade-specific, and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
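
The notion of related genes lying "in proximity beyond what is expected by chance" can be illustrated with a simple permutation test. The sketch below is not the method used in the article, which the abstract does not specify. It assumes a toy chromosome represented as an ordered list of hypothetical family labels (`None` for genes with no family assignment), uses adjacency as a stand-in for proximity, and compares the observed number of same-family neighbours with the counts obtained after shuffling gene order. The function names and the toy data are illustrative assumptions.

```python
import random


def adjacent_family_pairs(families):
    """Count neighbouring gene pairs that share a family label.

    `families` is an ordered list of labels along one chromosome;
    None marks a gene with no family assignment and never matches.
    """
    return sum(
        1
        for a, b in zip(families, families[1:])
        if a is not None and a == b
    )


def clustering_p_value(families, n_permutations=1000, seed=0):
    """Permutation test: is same-family adjacency higher than chance?

    Returns the observed pair count and an empirical one-sided p-value
    obtained by shuffling gene order along the chromosome.
    """
    rng = random.Random(seed)
    observed = adjacent_family_pairs(families)
    shuffled = list(families)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if adjacent_family_pairs(shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Toy chromosome (hypothetical labels, not data from the article).
chromosome = ["A", "A", "A", None, "B", None, "A", "C", "C", None, "B", None]
observed, p_value = clustering_p_value(chromosome)
print(f"same-family adjacent pairs: {observed}, empirical p = {p_value:.3f}")
```

A real analysis would need actual gene coordinates, a distance-based definition of proximity, and family assignments from resources such as those named in the abstract; this sketch only conveys the comparison against a shuffled baseline.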

Citing Articles

Emergence and influence of sequence bias in evolutionarily malleable, mammalian tandem arrays.

Brovkina M, Chapman M, Holding M, Clowney E. BMC Biol. 2023; 21(1):179.

PMID: 37612705 PMC: 10463633. DOI: 10.1186/s12915-023-01673-4.


RegenDbase: a comparative database of noncoding RNA regulation of tissue regeneration circuits across multiple taxa.

King B, Rosenstein M, Smith A, Dykeman C, Smith G, Yin V. NPJ Regen Med. 2018; 3:10.

PMID: 29872545 PMC: 5973935. DOI: 10.1038/s41536-018-0049-0.


Meiotic DSBs and the control of mammalian recombination.

Paigen K, Petkov P. Cell Res. 2012; 22(12):1624-6.

PMID: 22801475 PMC: 3515751. DOI: 10.1038/cr.2012.109.
