Inferring Binding Energies from Selected Binding Sites
We employ a biophysical model that accounts for the non-linear relationship between binding energy and the statistics of selected binding sites. The model includes the chemical potential of the transcription factor, the non-specific binding affinity of the protein for DNA, and sequence-specific parameters that may include non-independent contributions of bases to the interaction. We obtain maximum likelihood estimates for all of the parameters and compare the results to standard probabilistic methods of parameter estimation. On simulated data, where the true energy model is known and samples are generated with a variety of parameter values, our method returns much more accurate estimates of the true parameters and much better predictions of the selected binding site distributions. We also introduce a new high-throughput SELEX (HT-SELEX) procedure for determining the binding specificity of a transcription factor, in which the initial randomized library and the selected sites are sequenced with next-generation methods that return hundreds of thousands of sites. We show that, after a single round of selection, our method can estimate binding parameters that fit the selected site distributions very well, and much better than standard motif identification algorithms.
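To make the model's ingredients concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a two-state (bound/unbound) Boltzmann form with energies in units of kT, an additive energy matrix, and a library with known site frequencies. The function names, the parameter names `mu` and `e_ns`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"

def site_energy(site, energy_matrix):
    """Additive specific binding energy (in kT): one contribution per position."""
    return sum(energy_matrix[pos][BASES.index(base)] for pos, base in enumerate(site))

def binding_probability(site, energy_matrix, mu, e_ns):
    """Probability that a site is bound (assumed two-state Boltzmann form).

    mu   : chemical potential of the transcription factor (assumed name)
    e_ns : sequence-independent, non-specific binding energy (assumed name)
    """
    e_spec = site_energy(site, energy_matrix)
    # Statistical weight of the bound state (specific plus non-specific mode)
    # relative to the unbound state, whose weight is 1.
    weight = math.exp(mu - e_spec) + math.exp(mu - e_ns)
    return weight / (1.0 + weight)

def selected_distribution(library_freqs, energy_matrix, mu, e_ns):
    """Expected frequencies among selected sites after one round of selection."""
    unnormalized = {
        site: freq * binding_probability(site, energy_matrix, mu, e_ns)
        for site, freq in library_freqs.items()
    }
    total = sum(unnormalized.values())
    return {site: w / total for site, w in unnormalized.items()}

def log_likelihood(selected_counts, library_freqs, energy_matrix, mu, e_ns):
    """Multinomial log-likelihood of observed selected-site counts (up to a constant)."""
    probs = selected_distribution(library_freqs, energy_matrix, mu, e_ns)
    return sum(n * math.log(probs[site]) for site, n in selected_counts.items())

# Toy example: 2-bp sites, consensus "AC" at energy 0, each mismatch costs 2 kT.
toy_matrix = [
    [0.0, 2.0, 2.0, 2.0],  # position 0: A, C, G, T
    [2.0, 0.0, 2.0, 2.0],  # position 1: A, C, G, T
]
# Uniform randomized library over all 16 dinucleotides.
toy_library = {a + b: 1.0 / 16 for a in BASES for b in BASES}
# Made-up counts, for illustration only.
toy_counts = {"AC": 60, "AA": 15, "CC": 15, "GT": 10}

print(log_likelihood(toy_counts, toy_library, toy_matrix, mu=-1.0, e_ns=5.0))
```

In this sketch, maximum likelihood estimation would amount to maximizing `log_likelihood` over the energy matrix, `mu`, and `e_ns` with a numerical optimizer. The non-linearity mentioned above appears in `binding_probability`: when `mu` is large, low-energy sites saturate near probability 1, so selected-site frequencies are no longer proportional to a simple Boltzmann factor of the energy.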