PMID: 35606422

Prediction of Protein-ligand Binding Affinity from Sequencing Data with Interpretable Machine Learning

Abstract

Protein-ligand interactions are increasingly profiled at high throughput using affinity selection and massively parallel sequencing. However, these assays do not provide the biophysical parameters that most rigorously quantify molecular interactions. Here we describe a flexible machine learning method, called ProBound, that accurately defines sequence recognition in terms of equilibrium binding constants or kinetic rates. This is achieved using a multi-layered maximum-likelihood framework that models both the molecular interactions and the data generation process. We show that ProBound quantifies transcription factor (TF) behavior with models that predict binding affinity over a range exceeding that of previous resources; captures the impact of DNA modifications and conformational flexibility of multi-TF complexes; and infers specificity directly from in vivo data such as ChIP-seq without peak calling. When coupled with an assay called KD-seq, it determines the absolute affinity of protein-ligand interactions. We also apply ProBound to profile the kinetics of kinase-substrate interactions. ProBound opens new avenues for decoding biological networks and rationally engineering protein-ligand interactions.
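The idea of pairing a binding model with a model of the data generation process can be illustrated with a toy sketch. The code below is not ProBound's implementation. It assumes a purely additive per-position energy model, a single round of affinity selection in the low-saturation limit, and multinomial sampling of reads. The function names, the energy table, and the read counts are all hypothetical.

```python
import math

def relative_affinity(seq, ddg):
    """Relative affinity exp(-sum of per-position energies) for one sequence.

    ddg is a list with one dict per position, mapping base -> energy in
    units of RT. This additive form is an illustrative assumption.
    """
    energy = sum(ddg[i][base] for i, base in enumerate(seq))
    return math.exp(-energy)

def selection_log_likelihood(counts_in, counts_out, ddg):
    """Multinomial log-likelihood of post-selection read counts.

    Assumes the expected output frequency of a sequence is proportional to
    its input frequency times its relative affinity. The constant
    multinomial coefficient is omitted, and every sequence observed in the
    output is assumed to have a nonzero input count.
    """
    total_in = sum(counts_in.values())
    weights = {
        s: (n / total_in) * relative_affinity(s, ddg)
        for s, n in counts_in.items()
    }
    z = sum(weights.values())
    return sum(
        n * math.log(weights[s] / z)
        for s, n in counts_out.items()
        if n > 0
    )

# Hypothetical two-position model: 'A' is preferred, other bases cost 1 RT.
preferring_a = [{"A": 0.0, "C": 1.0, "G": 1.0, "T": 1.0}] * 2
# A flat model with no sequence preference, for comparison.
flat = [{"A": 0.0, "C": 0.0, "G": 0.0, "T": 0.0}] * 2

# Made-up read counts before and after selection.
counts_in = {"AA": 100, "AC": 100, "CC": 100}
counts_out = {"AA": 70, "AC": 25, "CC": 5}

ll_model = selection_log_likelihood(counts_in, counts_out, preferring_a)
ll_flat = selection_log_likelihood(counts_in, counts_out, flat)
print(ll_model > ll_flat)
```

In a maximum-likelihood fit, the energy parameters would be adjusted to maximize this log-likelihood. Here the model that prefers 'A' explains the enriched counts better than the flat model.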

Citing Articles

Cell2fate infers RNA velocity modules to improve cell fate prediction.

Aivazidis A, Memi F, Kleshchevnikov V, Er S, Clarke B, Stegle O. Nat Methods. 2025.

PMID: 40032996 DOI: 10.1038/s41592-025-02608-3.


Accurate sequence-to-affinity models for SH2 domains from multi-round peptide binding assays coupled with free-energy regression.

Gagoski D, Rube T, Rube H, Rastogi C, Melo L, Melo L. bioRxiv. 2025.

PMID: 39764007 PMC: 11703206. DOI: 10.1101/2024.12.23.630085.


Targeting serotonin receptors with phytochemicals - an in-silico study.

Elalouf A, Rosenfeld A, Maoz H. Sci Rep. 2024; 14(1):30307.

PMID: 39638796 PMC: 11621125. DOI: 10.1038/s41598-024-76329-6.


Perspectives on Codebook: sequence specificity of uncharacterized human transcription factors.

Jolma A, Laverty K, Fathi A, Yang A, Yellan I, Vorontsov I. bioRxiv. 2024.

PMID: 39605729 PMC: 11601247. DOI: 10.1101/2024.11.11.622097.


Identification of methylation-sensitive human transcription factors using meSMiLE-seq.

Gralak A, Faltejskova K, Yang A, Steiner C, Russeil J, Grenningloh N. bioRxiv. 2024.

PMID: 39605503 PMC: 11601298. DOI: 10.1101/2024.11.11.619598.

