
Current Limitations in Predicting mRNA Translation with Deep Learning Models

Overview
Journal: Genome Biol
Specialties: Biology, Genetics
Date: 2024 Aug 20
PMID: 39164757
Authors: Schlusser N, Gonzalez A, Pandey M, Zavolan M
Abstract

Background: The design of nucleotide sequences with defined properties is a long-standing problem in bioengineering. An important application is protein expression, be it in the context of research or the production of mRNA vaccines. The rate of protein synthesis depends on the 5' untranslated region (5'UTR) of the mRNAs, and recently, deep learning models were proposed to predict the translation output of mRNAs from the 5'UTR sequence. At the same time, large data sets of endogenous and reporter mRNA translation have become available.

Results: In this study, we use complementary data obtained in two different cell types to assess the accuracy and generality of currently available models for predicting translational output. We find that while performing well on the data sets on which they were trained, deep learning models do not generalize well to other data sets, in particular of endogenous mRNAs, which differ in many properties from reporter constructs.

Conclusions: These differences limit the ability of deep learning models to uncover mechanisms of translation control and to predict the impact of genetic variation. We suggest directions that combine high-throughput measurements and machine learning to unravel mechanisms of translation control and improve construct design.

Citing Articles

UTR-Insight: integrating deep learning for efficient 5' UTR discovery and design.

Pan S, Wang H, Zhang H, Tang Z, Xu L, Yan Z BMC Genomics. 2025; 26(1):107.

PMID: 39905334 PMC: 11796101. DOI: 10.1186/s12864-025-11269-7.


Improving the generalization of protein expression models with mechanistic sequence information.

Shen Y, Kudla G, Oyarzun D Nucleic Acids Res. 2025; 53(3).

PMID: 39873269 PMC: 11773361. DOI: 10.1093/nar/gkaf020.


Interpreting deep neural networks for the prediction of translation rates.

Korbel F, Eroshok E, Ohler U BMC Genomics. 2024; 25(1):1061.

PMID: 39522049 PMC: 11549864. DOI: 10.1186/s12864-024-10925-8.


Current limitations in predicting mRNA translation with deep learning models.

Schlusser N, Gonzalez A, Pandey M, Zavolan M Genome Biol. 2024; 25(1):227.

PMID: 39164757 PMC: 11337900. DOI: 10.1186/s13059-024-03369-6.


Predicting the translation efficiency of messenger RNA in mammalian cells.

Zheng D, Persyn L, Wang J, Liu Y, Montoya F, Cenik C bioRxiv. 2024; .

PMID: 39149337 PMC: 11326250. DOI: 10.1101/2024.08.11.607362.

