Intergenic ORFs As Elementary Structural Modules of De Novo Gene Birth and Protein Evolution

Overview
Journal: Genome Res
Specialty: Genetics
Date: 2021 Nov 23
PMID: 34810219
Citations: 17
Abstract

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of Saccharomyces cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.

Citing Articles

The ribosome profiling landscape of yeast reveals a high diversity in pervasive translation.

Papadopoulos C, Arbes H, Cornu D, Chevrollier N, Blanchet S, Roginski P. Genome Biol. 2024; 25(1):268.

PMID: 39402662 PMC: 11472626. DOI: 10.1186/s13059-024-03403-7.


De Novo Emerged Gene Search in Eukaryotes with DENSE.

Roginski P, Grandchamp A, Quignot C, Lopes A. Genome Biol Evol. 2024; 16(8).

PMID: 39212967 PMC: 11363675. DOI: 10.1093/gbe/evae159.


Ancestral Sequence Reconstruction as a Tool to Detect and Study De Novo Gene Emergence.

Vakirlis N, Acar O, Cherupally V, Carvunis A. Genome Biol Evol. 2024; 16(8).

PMID: 39004885 PMC: 11299112. DOI: 10.1093/gbe/evae151.


Are Most Human-Specific Proteins Encoded by Long Noncoding RNAs?

Sanejouand Y. J Mol Evol. 2024; 92(4):363-370.

PMID: 38916610 DOI: 10.1007/s00239-024-10174-z.


Modeling Length Changes in De Novo Open Reading Frames during Neutral Evolution.

Lebherz M, Ravi Iyengar B, Bornberg-Bauer E. Genome Biol Evol. 2024; 16(7).

PMID: 38879874 PMC: 11339603. DOI: 10.1093/gbe/evae129.

