PMID: 39394483

A Community Effort to Optimize Sequence-based Deep Learning Models of Gene Regulation

Abstract

A systematic evaluation of how model architectures and training strategies impact genomics model performance is needed. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. All top-performing models used neural networks but diverged in architectures and training strategies. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide models into modular building blocks. We tested all possible combinations for the top three models, further improving their performance. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets, demonstrating the progress that can be driven by gold-standard genomics datasets.
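The idea of splitting models into modular building blocks and testing every combination can be illustrated with a minimal sketch. The stage names and model labels below are hypothetical placeholders, since the abstract does not name the actual blocks, and the number of stages is an assumption chosen only for illustration. It is not the Prix Fixe implementation itself.

```python
from itertools import product

# Illustrative stage names (an assumption; the abstract does not name the blocks).
STAGES = ("data_and_training", "first_layers", "core_layers", "final_layers")

# Placeholder labels standing in for the top three models.
MODELS = ("model_1", "model_2", "model_3")


def enumerate_combinations(stages=STAGES, models=MODELS):
    """Return every assignment of a source model to each stage."""
    return [
        dict(zip(stages, choice))
        for choice in product(models, repeat=len(stages))
    ]


combos = enumerate_combinations()
print(len(combos))  # number of candidate hybrid models under these assumptions
print(combos[0])    # first combination: every stage taken from model_1
```

Each dictionary describes one hybrid model by recording which source model supplies each stage. In practice some pairings may be incompatible and would need to be filtered out before training.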

Citing Articles

ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants.

Pampari A, Shcherbina A, Kvon E, Kosicki M, Nair S, Kundu S. bioRxiv. 2025.

PMID: 39829783 PMC: 11741299. DOI: 10.1101/2024.12.25.630221.


A generative framework for enhanced cell-type specificity in rationally designed mRNAs.

Khoroshkin M, Zinkevich A, Aristova E, Yousefi H, Lee S, Mittmann T. bioRxiv. 2025.

PMID: 39803435 PMC: 11722239. DOI: 10.1101/2024.12.31.630783.


Predictive Modeling of Gene Expression and Localization of DNA Binding Site Using Deep Convolutional Neural Networks.

Karshenas A, Roschinger T, Garcia H. bioRxiv. 2025.

PMID: 39763851 PMC: 11702772. DOI: 10.1101/2024.12.17.629042.
