Predicting Cell-type-specific Gene Expression from Regions of Open Chromatin

Overview
Journal: Genome Res
Specialty: Genetics
Date: 2012 Sep 8
PMID: 22955983
Citations: 146
Abstract

Complex patterns of cell-type-specific gene expression are thought to be achieved by combinatorial binding of transcription factors (TFs) to sequence elements in regulatory regions. Predicting cell-type-specific expression in mammals has been hindered by the often unknown locations of distal regulatory regions. To alleviate this bottleneck, we used DNase-seq data from 19 diverse human cell types to identify proximal and distal regulatory elements at a genome-wide scale. Matched expression data allowed us to separate genes into three classes: cell-type-specific up-regulated, cell-type-specific down-regulated, and constitutively expressed genes. CG dinucleotide content and DNA accessibility in the promoters of these three classes of genes displayed substantial differences, highlighting the importance of including these aspects in modeling gene expression. We associated DNase I hypersensitive sites (DHSs) with genes and trained classifiers for different expression patterns. TF sequence motif matches in DHSs provided a strong performance improvement in predicting gene expression over the typical baseline approach of using proximal promoter sequences. In particular, we achieved competitive performance when discriminating up-regulated genes from different cell types, or genes up- and down-regulated under the same conditions. We identified previously known and new candidate cell-type-specific regulators. The models generated testable predictions of activating or repressive functions of regulators. DNase I footprints for these regulators were indicative of their direct binding to DNA. In summary, we successfully used information on open chromatin obtained by a single assay, DNase-seq, to address the problem of predicting cell-type-specific gene expression in mammalian organisms directly from regulatory sequence.
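The feature-construction steps named in the abstract (promoter CG dinucleotide content, associating DHSs with genes, and counting TF motif matches in those DHSs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not state how DHSs were assigned to genes or which classifier was used, so the distance window (`max_dist`), the observed/expected CpG ratio, and all function and variable names below are assumptions chosen only for illustration.

```python
from collections import Counter


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence (0.0 if undefined).

    Assumption: this common ratio stands in for the "CG dinucleotide
    content" mentioned in the abstract; the paper may use another measure.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def assign_dhs_to_genes(dhs_sites, tss_by_gene, max_dist=50_000):
    """Map each gene to the DHSs whose midpoint lies within max_dist of its TSS.

    dhs_sites: iterable of (name, chrom, start, end)
    tss_by_gene: dict gene -> (chrom, tss_position)
    max_dist is a hypothetical window, not a value taken from the paper.
    """
    assigned = {gene: [] for gene in tss_by_gene}
    for name, chrom, start, end in dhs_sites:
        mid = (start + end) // 2
        for gene, (g_chrom, tss) in tss_by_gene.items():
            if chrom == g_chrom and abs(mid - tss) <= max_dist:
                assigned[gene].append(name)
    return assigned


def motif_features(assigned, motif_hits, motifs):
    """Build one feature row per gene: motif match counts summed over its DHSs.

    motif_hits: dict dhs_name -> {motif_name: match_count}
    motifs: ordered list of motif names defining the feature columns
    """
    rows = {}
    for gene, sites in assigned.items():
        counts = Counter()
        for site in sites:
            counts.update(motif_hits.get(site, {}))
        rows[gene] = [counts[m] for m in motifs]
    return rows
```

The resulting per-gene rows, optionally extended with the CpG ratio of each promoter, could then be passed with expression-class labels to any standard classifier. A quadratic scan over genes and DHSs is used here for readability; a real genome-wide analysis would need an interval index.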

Citing Articles

iPSCs and iPSC-derived cells as a model of human genetic and epigenetic variation.

Quaid K, Xing X, Chen Y, Miao Y, Neilson A, Selvamani V Nat Commun. 2025; 16(1):1750.

PMID: 39966349 PMC: 11836351. DOI: 10.1038/s41467-025-56569-4.


Trithorax regulates long-term memory in Drosophila through epigenetic maintenance of mushroom body metabolic state and translation capacity.

Raun N, Jones S, Kerr O, Keung C, Butler E, Alka K PLoS Biol. 2025; 23(1):e3003004.

PMID: 39869640 PMC: 11835295. DOI: 10.1371/journal.pbio.3003004.


A multi-regional human brain atlas of chromatin accessibility and gene expression facilitates promoter-isoform resolution genetic fine-mapping.

Dong P, Song L, Bendl J, Misir R, Shao Z, Edelstien J Nat Commun. 2024; 15(1):10113.

PMID: 39578476 PMC: 11584674. DOI: 10.1038/s41467-024-54448-y.


High-throughput optimized prime editing mediated endogenous protein tagging for pooled imaging of protein localization.

Sanchez H, Lapidot T, Shalem O bioRxiv. 2024.

PMID: 39345511 PMC: 11429766. DOI: 10.1101/2024.09.16.613361.


Predicting gene expression state and prioritizing putative enhancers using 5hmC signal.

Gonzalez-Avalos E, Onodera A, Samaniego-Castruita D, Rao A, Ay F Genome Biol. 2024; 25(1):142.

PMID: 38825692 PMC: 11145787. DOI: 10.1186/s13059-024-03273-z.

