Pseudo-messenger RNA: Phantoms of the Transcriptome
Overview
The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.
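As a rough illustration of the kind of reading-frame check the abstract describes, the following minimal sketch scans a putative coding sequence for in-frame stop codons ahead of the terminal codon and labels each by its conventional name (amber, ochre, opal). It is not the authors' pipeline, which also rules out sequencing errors and detects frameshifts. The function names `premature_stops` and `only_opal` are hypothetical, and the input is assumed to be a DNA-alphabet cDNA string that already starts in frame. The last line simply reproduces the "just over 10%" figure from the reported counts.

```python
# Toy sketch only: names and input conventions are assumptions, not the paper's method.
STOP_NAMES = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def premature_stops(cds):
    """Return (codon_index, stop_name) for each in-frame stop before the final codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is skipped because a stop there is the normal terminator.
    return [(i, STOP_NAMES[c]) for i, c in enumerate(codons[:-1]) if c in STOP_NAMES]


def only_opal(cds):
    """True if the sequence has at least one premature stop and all of them are opal (TGA)."""
    stops = premature_stops(cds)
    return bool(stops) and all(name == "opal" for _, name in stops)


# Share of FANTOM3 cDNAs classed as pseudo-messenger RNA, from the counts in the abstract.
fraction = 10_679 / 102_801
```

A transcript for which `only_opal` returns true would be a candidate of the sort the abstract links to possible selenoproteins, since TGA can be recoded as selenocysteine; confirming that would need evidence well beyond this check.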