
Optimizing Sequence Design Strategies for Perturbation MPRAs: a Computational Evaluation Framework

Overview
Specialty: Biochemistry
Date: 2024 Jan 31
PMID: 38296821
Abstract

The advent of perturbation-based massively parallel reporter assays (MPRAs) has facilitated the delineation of the roles of non-coding regulatory elements in orchestrating gene expression. However, computational efforts to evaluate and establish guidelines for sequence design strategies for perturbation MPRAs remain scant. In this study, we propose a framework for evaluating and comparing perturbation strategies for MPRA experiments. Within this framework, we benchmark three perturbation approaches from three perspectives: alteration of motif-based profiles, consistency of MPRA outputs, and robustness of models that predict the activities of putative regulatory motifs. While our analyses show very similar results across multiple benchmarking metrics, predictive modeling for the approach involving random nucleotide shuffling is significantly more robust than for the other two approaches. Thus, we recommend designing sequences for perturbation MPRAs by randomly shuffling the nucleotides of the perturbed site, followed by a coherence check to prevent the introduction of other variations of the target motifs. In summary, our evaluation framework and benchmarking findings provide a resource of computational pipelines and highlight the potential of perturbation MPRAs in predicting non-coding regulatory activities.
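The recommended design (shuffle the perturbed site, then run a coherence check) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the retry limit, and the use of exact string matching against a list of known motif variants are all assumptions made for illustration. A real design would more likely score candidates against position weight matrices.

```python
import random


def overlaps_variant(seq, start, end, variants):
    """Return True if any motif variant occurs overlapping seq[start:end].

    Assumption: variants are plain strings matched exactly.
    """
    for v in variants:
        # Widen the window by len(v) - 1 on each side so that every match
        # found inside it necessarily overlaps the perturbed site.
        lo = max(0, start - len(v) + 1)
        hi = min(len(seq), end + len(v) - 1)
        if v in seq[lo:hi]:
            return True
    return False


def shuffle_site(seq, start, end, variants, rng=None, max_tries=100):
    """Shuffle the nucleotides of seq[start:end], keeping the flanks intact.

    A shuffled candidate is rejected if it recreates any listed motif
    variant at the perturbed site (the hypothetical coherence check).
    """
    rng = rng or random.Random()
    site = list(seq[start:end])
    for _ in range(max_tries):
        rng.shuffle(site)  # preserves the nucleotide composition of the site
        candidate = seq[:start] + "".join(site) + seq[end:]
        if not overlaps_variant(candidate, start, end, variants):
            return candidate
    raise ValueError("no motif-free shuffle found within max_tries")


# Example: perturb a CRE-like site (positions 4-11) in a toy sequence.
toy = "ACGTTGACGTCATTGCA"
perturbed = shuffle_site(toy, 4, 12, ["TGACGTCA", "TGACG"], rng=random.Random(0))
print(perturbed)
```

Shuffling keeps the nucleotide composition of the site unchanged, so the perturbed sequence differs from the original only in the order of bases at the target site. The rejection loop is one simple way to implement the coherence check.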

Citing Articles

Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation.

DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A Genome Biol. 2024; 25(1):221.

PMID: 39143563 PMC: 11323586. DOI: 10.1186/s13059-024-03365-w.


Comprehensive network modeling approaches unravel dynamic enhancer-promoter interactions across neural differentiation.

DeGroat W, Inoue F, Ashuach T, Yosef N, Ahituv N, Kreimer A bioRxiv. 2024.

PMID: 38826254 PMC: 11142193. DOI: 10.1101/2024.05.22.595375.
