
PRIESSTESS: Interpretable, High-performing Models of the Sequence and Structure Preferences of RNA-binding Proteins

Overview
Specialty: Biochemistry
Date: 2022 Aug 26
PMID: 36018788
Abstract

Modelling both primary sequence and secondary structure preferences for RNA-binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA-binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs that are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA-binding modes and captured known primary sequence and secondary structure preferences for each. Moreover, when applying PRIESSTESS to 144 RBPs across 202 RNA-binding datasets, 75% showed an RNA secondary structure preference but only 10% had a preference besides unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.
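The motif-reduction step described in the abstract (many candidate motifs pruned to a few core motifs by LASSO-regularized logistic regression) can be illustrated with a small toy sketch. This is not the PRIESSTESS implementation: the candidate motifs, the filler sequence, the binding rule, the presence/absence scoring, and every function name below are hypothetical choices made for illustration, and the model is fitted with a plain proximal gradient loop rather than whatever solver the authors used.

```python
import math
from itertools import product

# Hypothetical candidate motifs (illustrative only, not taken from the paper).
CANDIDATE_MOTIFS = ["GCAUG", "UUUUU", "CCCCC"]
FILLER = "AAAAA"  # placeholder inserted where a motif is absent


def make_toy_dataset():
    """Build one sequence per presence/absence combination of the candidates.

    Toy assumption: a sequence counts as 'bound' iff it contains the first motif.
    """
    sequences, labels = [], []
    for bits in product([0, 1], repeat=len(CANDIDATE_MOTIFS)):
        parts = [m if b else FILLER for m, b in zip(CANDIDATE_MOTIFS, bits)]
        sequences.append("AA" + "AA".join(parts) + "AA")
        labels.append(bits[0])
    return sequences, labels


def motif_features(seq):
    """Score each candidate motif as 1.0 if it occurs in the sequence, else 0.0."""
    return [1.0 if m in seq else 0.0 for m in CANDIDATE_MOTIFS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink towards zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam=0.1, lr=0.5, n_iter=500):
    """L1-penalised logistic regression fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        preds = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(d)]
        grad_b = sum(errors) / n
        # Gradient step on the log-loss, then soft-thresholding (the LASSO part).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the intercept is left unpenalised
    return w, b


sequences, labels = make_toy_dataset()
X = [motif_features(s) for s in sequences]
weights, intercept = fit_lasso_logistic(X, labels)

# Motifs whose coefficient survives the L1 penalty form the "core" set.
core_motifs = [m for m, wj in zip(CANDIDATE_MOTIFS, weights) if wj != 0.0]
print(core_motifs)
```

On this balanced toy dataset the L1 penalty should keep only the motif that determines the label and set the two uninformative candidates to exactly zero, which is the sense in which a sparse model yields a short, inspectable motif list. Real binding data would need graded motif scores, held-out evaluation, and a tuned penalty strength.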

Citing Articles

Cross-platform DNA motif discovery and benchmarking to explore binding specificities of poorly studied human transcription factors.

Vorontsov I, Kozin I, Abramov S, Boytsov A, Jolma A, Albu M bioRxiv. 2024.

PMID: 39605530 PMC: 11601219. DOI: 10.1101/2024.11.11.619379.


GHT-SELEX demonstrates unexpectedly high intrinsic sequence specificity and complex DNA binding of many human transcription factors.

Jolma A, Hernandez-Corchado A, Yang A, Fathi A, Laverty K, Brechalov A bioRxiv. 2024.

PMID: 39605368 PMC: 11601218. DOI: 10.1101/2024.11.11.618478.


ePRINT: exonuclease assisted mapping of protein-RNA interactions.

Hawkins S, Mondaini A, Namboori S, Nguyen G, Yeo G, Javed A Genome Biol. 2024; 25(1):140.

PMID: 38807229 PMC: 11134894. DOI: 10.1186/s13059-024-03271-1.


Deep Learning for Elucidating Modifications to RNA-Status and Challenges Ahead.

Rennie S Genes (Basel). 2024; 15(5).

PMID: 38790258 PMC: 11121098. DOI: 10.3390/genes15050629.


DeepFusion: A deep bimodal information fusion network for unraveling protein-RNA interactions using in vivo RNA structures.

Qiao Y, Yang R, Liu Y, Chen J, Zhao L, Huo P Comput Struct Biotechnol J. 2024; 23:617-625.

PMID: 38274994 PMC: 10808905. DOI: 10.1016/j.csbj.2023.12.040.

