Using Traditional Machine Learning and Deep Learning Methods for On- and Off-target Prediction in CRISPR/Cas9: a Review
CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a popular and effective two-component technology for targeted genetic manipulation. It is currently the most versatile and accurate method of gene and genome editing and has a wide range of practical applications. For example, in biomedicine, it has been used in research on cancer, viral infections, pathogen detection, and genetic diseases. Because cleavage may also occur at non-target sequence locations, current CRISPR/Cas9 research relies on data-driven models for on- and off-target prediction. Conventional machine learning and deep learning methods are now routinely applied to predict the on-target knockout efficacy and off-target profile of given single-guide RNAs (sgRNAs). In this paper, we present an overview and a comparative analysis of traditional machine learning and deep learning models used in CRISPR/Cas9. We highlight the key research challenges and directions associated with target activity prediction. We discuss recent advances in the sgRNA-DNA sequence encoding used in state-of-the-art on- and off-target prediction models. Furthermore, we present the most popular deep neural network architectures used in CRISPR/Cas9 prediction models. Finally, we summarize the existing challenges and discuss possible future investigations in the field of on- and off-target prediction. Our paper provides valuable support for academic and industrial researchers interested in applying machine learning methods to CRISPR/Cas9 genome editing.