GenerRNA: A Generative Pre-trained Language Model for De Novo RNA Design
Overview
The design of RNA plays a crucial role in developing RNA vaccines, nucleic acid therapeutics, and innovative biotechnological tools. However, existing techniques frequently lack versatility across tasks and depend on pre-defined secondary structures or other prior knowledge. To address these limitations, we introduce GenerRNA, a Transformer-based model inspired by the success of large language models (LLMs) in protein and molecule generation. GenerRNA is pre-trained on large-scale RNA sequences and can generate novel RNA sequences with stable secondary structures that remain distinct from existing sequences, thereby expanding our exploration of the RNA space. Moreover, GenerRNA can be fine-tuned on smaller, specialized datasets for specific subtasks, enabling the generation of RNAs with desired functionalities or properties without requiring any prior knowledge as input. As a demonstration, we fine-tuned GenerRNA and successfully generated novel RNA sequences exhibiting high affinity for target proteins. Our work is the first application of a generative language model to RNA generation, presenting an innovative approach to RNA design.
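The core idea of generating RNA sequences autoregressively, one token at a time conditioned on the tokens so far, can be illustrated with a minimal sketch. The code below is not GenerRNA's implementation: it substitutes a toy k-gram statistics table for the pre-trained Transformer, and all function names and the example training sequences are hypothetical, chosen only to show the sampling loop.

```python
import random

# Toy stand-in for a learned language model: a k-gram transition table.
# GenerRNA uses a Transformer for this conditional distribution instead.
NUCLEOTIDES = "ACGU"

def train_counts(sequences, k=2):
    """Count k-gram -> next-nucleotide transitions from training sequences."""
    counts = {}
    for seq in sequences:
        for i in range(len(seq) - k):
            ctx, nxt = seq[i:i + k], seq[i + k]
            counts.setdefault(ctx, {}).setdefault(nxt, 0)
            counts[ctx][nxt] += 1
    return counts

def sample_next(seq, counts, k=2):
    """Sample the next nucleotide given the last k characters (uniform fallback)."""
    dist = counts.get(seq[-k:])
    if not dist:
        return random.choice(NUCLEOTIDES)
    bases, weights = zip(*dist.items())
    return random.choices(bases, weights=weights)[0]

def generate(counts, length=30, k=2, seed="GG"):
    """Autoregressive decoding: repeatedly append a sampled nucleotide."""
    seq = seed
    while len(seq) < length:
        seq += sample_next(seq, counts, k)
    return seq

# Hypothetical training data, for illustration only.
counts = train_counts(["GGAUCCGGAUCCGGAUCC", "GGCUAGGCUAGGCUA"])
print(generate(counts, length=30))
```

Fine-tuning corresponds to re-estimating the same conditional distribution on a smaller, task-specific set of sequences (e.g. known protein-binding aptamers), so that sampling is biased toward the desired property without any structural constraints being supplied.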