DanQ: a Hybrid Convolutional and Recurrent Deep Neural Network for Quantifying the Function of DNA Sequences

Overview
Specialty: Biochemistry
Date: 2016 Apr 17
PMID: 27084946
Citations: 306
Abstract

Modeling the properties and functions of DNA sequences is an important but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA could be of enormous benefit to both basic science and translational research, because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' that improves predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ achieves over a 50% relative improvement in the area under the precision-recall curve compared to related models. We have made the source code available at the GitHub repository http://github.com/uci-cbcl/DanQ.
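
The layer ordering described in the abstract (a convolution layer that scans for motifs, followed by a bi-directional LSTM that reads the motif activations in both directions) can be illustrated with a small, dependency-free forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the toy layer sizes, the max-pooling step, the dense sigmoid output and the random, untrained weights are all illustrative choices that do not come from the abstract. The published code is in the repository linked above.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); other symbols become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_relu(x, filters):
    """Slide each filter (width x 4 weights) along x; ReLU keeps only positive motif matches."""
    width = len(filters[0])
    out = []
    for start in range(len(x) - width + 1):
        window = x[start:start + width]
        out.append([
            max(0.0, sum(w * v for wrow, vrow in zip(f, window) for w, v in zip(wrow, vrow)))
            for f in filters
        ])
    return out


def max_pool(x, size):
    """Keep the strongest activation of each filter in non-overlapping windows (assumed step)."""
    return [[max(col) for col in zip(*x[i:i + size])]
            for i in range(0, len(x) - size + 1, size)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lstm(xs, weights, hidden):
    """Single-direction LSTM without bias terms; each gate matrix acts on [input, previous h]."""
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for x in xs:
        xh = x + h
        i = [sigmoid(z) for z in matvec(weights["i"], xh)]    # input gate
        f = [sigmoid(z) for z in matvec(weights["f"], xh)]    # forget gate
        o = [sigmoid(z) for z in matvec(weights["o"], xh)]    # output gate
        g = [math.tanh(z) for z in matvec(weights["g"], xh)]  # candidate cell state
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        states.append(h)
    return states


def bilstm(xs, fwd, bwd, hidden):
    """Read the sequence left-to-right and right-to-left, then concatenate per position."""
    forward = lstm(xs, fwd, hidden)
    backward = lstm(xs[::-1], bwd, hidden)[::-1]
    return [a + b for a, b in zip(forward, backward)]


def toy_danq_forward(seq, n_filters=4, width=5, pool=3, hidden=3, n_targets=2, seed=0):
    """Hypothetical toy forward pass: conv -> pool -> BiLSTM -> dense sigmoid outputs."""
    rng = random.Random(seed)  # untrained random weights, for shape illustration only
    filters = [rand_matrix(rng, width, 4) for _ in range(n_filters)]

    def gate_weights():
        return {k: rand_matrix(rng, hidden, n_filters + hidden) for k in "ifog"}

    fwd, bwd = gate_weights(), gate_weights()
    features = max_pool(conv1d_relu(one_hot(seq), filters), pool)
    states = bilstm(features, fwd, bwd, hidden)
    flat = [v for state in states for v in state]
    dense = rand_matrix(rng, n_targets, len(flat))
    return [sigmoid(z) for z in matvec(dense, flat)]


# One independent probability per (hypothetical) regulatory target.
print(toy_danq_forward("ACGTACGTACGTACGTACGT"))
```

Because the weights are random, the printed probabilities carry no biological meaning. The point is the data flow: motif activations from the convolution become the time steps that the recurrent layer reads in both directions before a per-target prediction is made.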

Citing Articles

Enhancer reprogramming: critical roles in cancer and promising therapeutic strategies.

Yang J, Zhou F, Luo X, Fang Y, Wang X, Liu X. Cell Death Discov. 2025; 11(1):84.

PMID: 40032852 PMC: 11876437. DOI: 10.1038/s41420-025-02366-3.


Modeling and designing enhancers by introducing and harnessing transcription factor binding units.

Li J, Zhang P, Xi X, Liu L, Wei L, Wang X. Nat Commun. 2025; 16(1):1469.

PMID: 39922842 PMC: 11807178. DOI: 10.1038/s41467-025-56749-2.


DNALongBench: A Benchmark Suite for Long-Range DNA Prediction Tasks.

Cheng W, Song Z, Zhang Y, Wang S, Wang D, Yang M. bioRxiv. 2025.

PMID: 39829833 PMC: 11741265. DOI: 10.1101/2025.01.06.631595.


ProPr54 web server: predicting σ54 promoters and regulon with a hybrid convolutional and recurrent deep neural network.

Achterberg T, de Jong A. NAR Genom Bioinform. 2025; 7(1):lqae188.

PMID: 39781509 PMC: 11704786. DOI: 10.1093/nargab/lqae188.


Maize inbreds show allelic variation for diel transcription patterns.

Gage J, Romay M, Buckler E. bioRxiv. 2025.

PMID: 39763849 PMC: 11702552. DOI: 10.1101/2024.12.16.628400.

