
An Interpretable Bimodal Neural Network Characterizes the Sequence and Preexisting Chromatin Predictors of Induced Transcription Factor Binding

Overview
Journal: Genome Biol
Specialties: Biology, Genetics
Date: 2021 Jan 8
PMID: 33413545
Citations: 10
Abstract

Background: Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor's DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes.

Results: Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some transcription factors substantially, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences.
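
As a rough illustration of what "bimodal" means in this setting, the sketch below scores a genomic window with two separate branches, one reading the DNA sequence and one reading preexisting chromatin signal, and then combines the two scores into a binding probability. This is not Bichrom's architecture, which the abstract does not specify. The function names, the single hand-written motif filter, the mean-signal chromatin summary, and all parameter values are assumptions chosen to keep the example small and runnable.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T]; unknown bases become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def sequence_score(seq, motif_filter):
    """Sequence branch: slide one filter along the window and keep the best match."""
    x = one_hot(seq)
    width = len(motif_filter)
    if len(x) < width:
        raise ValueError("sequence is shorter than the filter")
    return max(
        sum(x[start + i][j] * motif_filter[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(x) - width + 1)
    )

def chromatin_score(tracks, weights):
    """Chromatin branch: weighted sum of the mean signal of each track."""
    return sum(w * (sum(t) / len(t)) for t, w in zip(tracks, weights))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bimodal_predict(seq, tracks, params):
    """Combine both branch scores linearly and squash to a probability."""
    s = sequence_score(seq, params["motif_filter"])
    c = chromatin_score(tracks, params["track_weights"])
    return sigmoid(params["w_seq"] * s + params["w_chrom"] * c + params["bias"])

# Hypothetical parameters: a filter that rewards the made-up motif "TGAC"
# and a single accessibility-like track. None of these values are learned.
params = {
    "motif_filter": [[1.0 if base == m else 0.0 for base in BASES] for m in "TGAC"],
    "track_weights": [1.0],
    "w_seq": 1.0,
    "w_chrom": 0.5,
    "bias": -4.0,
}

open_chromatin = [[3.0, 4.0, 5.0, 4.0]]    # toy preexisting signal, accessible
closed_chromatin = [[0.0, 0.0, 0.0, 0.0]]  # toy preexisting signal, inaccessible

p_motif_open = bimodal_predict("AATGACAA", open_chromatin, params)
p_motif_closed = bimodal_predict("AATGACAA", closed_chromatin, params)
p_weak_open = bimodal_predict("AAAAAAAA", open_chromatin, params)
```

Keeping the branches separate until the final combination makes it possible to inspect the sequence score and the chromatin score of each site on their own, which is the kind of site-level comparison the Results paragraph describes. In a trained model, the filter and the weights would be learned from data rather than written by hand.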

Conclusions: Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.

Citing Articles

Improving the generalization of protein expression models with mechanistic sequence information.

Shen Y, Kudla G, Oyarzun D Nucleic Acids Res. 2025; 53(3).

PMID: 39873269 PMC: 11773361. DOI: 10.1093/nar/gkaf020.


Neonatal apnea and hypopnea prediction in infants with Robin sequence with neural additive models for time series.

Vetter J, Lim K, Dijkstra T, Dargaville P, Kohlbacher O, Macke J PLOS Digit Health. 2024; 3(12):e0000678.

PMID: 39671454 PMC: 11642933. DOI: 10.1371/journal.pdig.0000678.


Applying interpretable machine learning in computational biology-pitfalls, recommendations and opportunities for new developments.

Chen V, Yang M, Cui W, Kim J, Talwalkar A, Ma J Nat Methods. 2024; 21(8):1454-1461.

PMID: 39122941 PMC: 11348280. DOI: 10.1038/s41592-024-02359-7.


Systematic dissection of sequence features affecting binding specificity of a pioneer factor reveals binding synergy between FOXA1 and AP-1.

Xu C, Kleinschmidt H, Yang J, Leith E, Johnson J, Tan S Mol Cell. 2024; 84(15):2838-2855.e10.

PMID: 39019045 PMC: 11334613. DOI: 10.1016/j.molcel.2024.06.022.


Predicting gene expression responses to environment in using natural variation in DNA sequence.

Takou M, Bellis E, Lasky J bioRxiv. 2024.

PMID: 38712066 PMC: 11071634. DOI: 10.1101/2024.04.25.591174.

