Codon Usage and Expression-based Features Significantly Improve Prediction of CRISPR Efficiency

Overview
Specialty: Biology
Date: 2024 Sep 3
PMID: 39227603
Abstract

CRISPR is a precise and effective genome editing technology, but despite several advances over the last decade, our ability to computationally design gRNAs remains limited. Most predictive models have relatively low predictive power and use only the sequence of the target site as input. Here we propose a new category of features that incorporate the genomic position of the target site and the presence of genes close to it. We calculate four features based on gene expression and codon usage bias indices. Using CRISPR datasets from three different cell types, we show that these features perform comparably with 425 state-of-the-art predictive features, ranking in the top 2-12% of features. We also trained new predictive models and show that adding the expression features significantly improves their correlation coefficient (r) by up to 0.04 (a relative increase of 39%), achieving average correlations of up to 0.38 on their validation sets, and that different feature importance metrics deem these features important. We believe that incorporating the target site's position, in addition to its sequence, in features such as those generated here will improve our ability to predict, design and understand CRISPR experiments going forward.
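The abstract does not spell out how the four features are computed, so the following is only a minimal sketch of the general idea: look up the gene nearest to a target site and reuse that gene's expression level and codon usage bias index as features of the site. The `Gene` record, the `position_features` function, the `window` cutoff and the three feature names are all hypothetical and are not taken from the paper; the expression and codon bias values are assumed to be precomputed elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene annotation record (not the paper's data model)."""
    chrom: str
    start: int         # 0-based, inclusive
    end: int           # 0-based, exclusive
    expression: float  # assumed precomputed expression level (e.g. TPM)
    codon_bias: float  # assumed precomputed codon usage bias index


def distance_to_gene(pos: int, gene: Gene) -> int:
    """Distance in bases from a position to a gene; 0 if the position is inside it."""
    if gene.start <= pos < gene.end:
        return 0
    if pos < gene.start:
        return gene.start - pos
    return pos - gene.end + 1


def nearest_gene(chrom: str, pos: int, genes: List[Gene]) -> Optional[Gene]:
    """Return the closest gene on the same chromosome, or None if there is none."""
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return None
    return min(candidates, key=lambda g: distance_to_gene(pos, g))


def position_features(chrom: str, pos: int, genes: List[Gene],
                      window: int = 10_000) -> Dict[str, float]:
    """Position-based features for a target site (illustrative assumption only).

    If no gene lies within `window` bases of the site, all features are 0.0.
    """
    gene = nearest_gene(chrom, pos, genes)
    if gene is None or distance_to_gene(pos, gene) > window:
        return {"near_gene": 0.0, "expression": 0.0, "codon_bias": 0.0}
    return {
        "near_gene": 1.0,
        "expression": gene.expression,
        "codon_bias": gene.codon_bias,
    }
```

In a setup like this, the returned dictionary would be appended to the sequence-derived features of each gRNA before model training. A linear scan over all genes is used for brevity; a real annotation set would more plausibly use an interval index.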
