A Continuum of Evolving De Novo Genes Drives Protein-Coding Novelty in Drosophila
Overview
Orphan genes, which lack detectable homologs in outgroup species, typically represent 10-30% of eukaryotic genomes. Efforts to find the source of these young genes indicate that de novo emergence from non-coding DNA may in part explain their prevalence. Here, we investigate the roots of orphan gene emergence in the Drosophila genus. Across the annotated proteomes of twelve species, we find 6297 orphan genes within 4953 taxon-specific clusters of orthologs. By inferring the ancestral DNA as non-coding for between 550 and 2467 (8.7-39.2%) of these genes, we describe for the first time how de novo emergence contributes to the abundance of clade-specific Drosophila genes. Consistent with functional roles, de novo genes show robust expression and translational support. However, the distinct nucleotide sequences of de novo genes, which have characteristics intermediate between those of intergenic regions and conserved genes, reflect their recent birth from non-coding DNA. We find that de novo genes encode more disordered proteins than both older genes and intergenic regions. Together, our results suggest that gene emergence from non-coding DNA provides an abundant source of material for the evolution of new proteins. Following gene birth, gradual evolution over long evolutionary timescales moulds sequence properties towards those of conserved genes, resulting in a continuum of properties whose starting points depend on the nucleotide sequences of an initial pool of novel genes.
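As a quick check on the percentages quoted in the abstract, the range can be recomputed from the reported counts. This is a minimal sketch: it assumes the denominator is the 6297 orphan genes (not the 4953 clusters), and the function name `percent_range` is hypothetical, introduced only for illustration.

```python
def percent_range(low_count, high_count, total):
    """Return (low, high) as percentages of total, rounded to one decimal place."""
    low = round(100 * low_count / total, 1)
    high = round(100 * high_count / total, 1)
    return low, high


# Counts taken from the abstract; using the orphan-gene total as the
# denominator is an assumption, not something the abstract states outright.
TOTAL_ORPHAN_GENES = 6297
DE_NOVO_LOW, DE_NOVO_HIGH = 550, 2467

low_pct, high_pct = percent_range(DE_NOVO_LOW, DE_NOVO_HIGH, TOTAL_ORPHAN_GENES)
print(f"{low_pct}-{high_pct}% of orphan genes inferred as de novo")
```

Under that assumption the computed range matches the 8.7-39.2% given in the abstract, which suggests the percentages are indeed relative to the 6297 orphan genes.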
Aldrovandi S, Fajardo Castro J, Ullrich K, Karger A, Luria V, Tautz D. Genome Biol Evol. 2024; 16(12).
PMID: 39663928 PMC: 11635099. DOI: 10.1093/gbe/evae175.
Orphan genes are not a distinct biological entity.
Pereira A, Marano M, Bathala R, Zaragoza R, Neira A, Samano A. Bioessays. 2024; 47(1):e2400146.
PMID: 39491810 PMC: 11662153. DOI: 10.1002/bies.202400146.
The ribosome profiling landscape of yeast reveals a high diversity in pervasive translation.
Papadopoulos C, Arbes H, Cornu D, Chevrollier N, Blanchet S, Roginski P. Genome Biol. 2024; 25(1):268.
PMID: 39402662 PMC: 11472626. DOI: 10.1186/s13059-024-03403-7.
Sequence, Structure, and Functional Space of Drosophila De Novo Proteins.
Middendorf L, Ravi Iyengar B, Eicholt L. Genome Biol Evol. 2024; 16(8).
PMID: 39212966 PMC: 11363682. DOI: 10.1093/gbe/evae176.
An orphan gene is essential for efficient sperm entry into eggs in .
Guay S, Patel P, Thomalla J, McDermott K, OToole J, Arnold S. bioRxiv. 2024.
PMID: 39149251 PMC: 11326263. DOI: 10.1101/2024.08.08.607187.