From De Novo to "De Nono": The Majority of Novel Protein-Coding Genes Identified with Phylostratigraphy Are Old Genes or Recent Duplicates

Overview
Date 2018 Oct 23
PMID 30346517
Citations 29
Abstract

The evolution of novel protein-coding genes from noncoding regions of the genome is one of the most compelling pieces of evidence for genetic innovations in nature. One popular approach to identify de novo genes is phylostratigraphy, which consists of determining the approximate time of origin (age) of a gene based on its distribution along a species phylogeny. Several studies have revealed significant flaws in determining the age of genes, including de novo genes, using phylostratigraphy alone. However, the rate of false positives in de novo gene surveys, based on phylostratigraphy, remains unknown. Here, I reanalyze the findings from three studies, two of which identified tens to hundreds of rodent-specific de novo genes adopting a phylostratigraphy-centered approach. Most putative de novo genes discovered in these investigations are no longer included in recently updated mouse gene sets. Using a combination of synteny information and sequence similarity searches, I show that ∼60% of the remaining 381 putative de novo genes share homology with genes from other vertebrates, originated through gene duplication, and/or share no synteny information with nonrodent mammals. These results led to an estimated rate of ∼12 de novo genes per million years in mouse. Contrary to a previous study (Wilson BA, Foy SG, Neme R, Masel J. 2017. Young genes are highly disordered as predicted by the preadaptation hypothesis of de novo gene birth. Nat Ecol Evol. 1:0146), I found no evidence supporting the preadaptation hypothesis of de novo gene formation. Nearly half of the de novo genes confirmed in this study are within older genes, indicating that co-option of preexisting regulatory regions and a higher GC content may facilitate the origin of novel genes.
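The age-assignment idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the author's pipeline: the clade list, the species table, and the function names `phylostratum` and `passes_follow_up` are all hypothetical, and the three boolean inputs stand in for the results of the sequence similarity, duplication, and synteny analyses mentioned above.

```python
# Toy illustration of phylostratigraphy plus the follow-up checks named in the
# abstract. All clade names, species, and hit sets are hypothetical.

# Nested clades containing the focal species (mouse), ordered oldest to youngest.
LINEAGE = ["Vertebrata", "Mammalia", "Rodentia", "Mus musculus"]

# For each species, the youngest clade in LINEAGE that it shares with mouse.
SHARED_CLADE = {
    "mouse": "Mus musculus",
    "rat": "Rodentia",
    "human": "Mammalia",
    "zebrafish": "Vertebrata",
}


def phylostratum(species_with_hits):
    """Return the oldest clade spanned by the species in which a homolog was found."""
    ranks = [LINEAGE.index(SHARED_CLADE[s]) for s in species_with_hits]
    return LINEAGE[min(ranks)]


def passes_follow_up(has_nonrodent_homolog, is_duplicate, has_nonrodent_synteny):
    """Keep a rodent-specific candidate only if no rejection criterion applies."""
    return (not has_nonrodent_homolog) and (not is_duplicate) and has_nonrodent_synteny


print(phylostratum({"mouse", "rat"}))               # gene seen only in rodents
print(phylostratum({"mouse", "rat", "zebrafish"}))  # gene seen across vertebrates
print(passes_follow_up(False, False, True))
```

A gene found only in mouse and rat is placed in the rodent stratum, which is why a missed distant homolog makes an old gene look young. The second function mirrors the three rejection criteria listed in the abstract: homology outside rodents, origin by duplication, or absent synteny information with nonrodent mammals.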

Citing Articles

The De Novo Emergence of Two Brain Genes in the Human Lineage Appears to be Unsupported.

Hannon Bozorgmehr J. J Mol Evol. 2024; 93(1):3-10.

PMID: 39725692. DOI: 10.1007/s00239-024-10227-3.


Orphan genes are not a distinct biological entity.

Pereira A, Marano M, Bathala R, Zaragoza R, Neira A, Samano A. Bioessays. 2024; 47(1):e2400146.

PMID: 39491810 PMC: 11662153. DOI: 10.1002/bies.202400146.


Dollo Parsimony Overestimates Ancestral Gene Content Reconstructions.

Galvez-Morante A, Gueguen L, Natsidis P, Telford M, Richter D. Genome Biol Evol. 2024; 16(4).

PMID: 38518756 PMC: 10995720. DOI: 10.1093/gbe/evae062.


Four classic "de novo" genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences.

Hannon Bozorgmehr J. Mol Genet Genomics. 2024; 299(1):6.

PMID: 38315248. DOI: 10.1007/s00438-023-02090-6.


Statistical analysis of synonymous and stop codons in pseudo-random and real sequences as a function of GC content.

Wesp V, Theissen G, Schuster S. Sci Rep. 2023; 13(1):22996.

PMID: 38151539 PMC: 10752896. DOI: 10.1038/s41598-023-49626-9.

