
Deriving Transcriptional Programs and Functional Processes from Gene Expression Databases

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2012 Mar 13
PMID: 22408194
Citations: 1
Abstract

Motivation: Revealing the underlying molecular state of a cell with a system-wide approach is a long-standing biological challenge. Developed over the last decade, gene expression profiles possess the characteristics of such an assay: they can reveal both underlying molecular events and broader phenotypes such as clinical outcomes. To interpret these profiles, many gene sets have been developed that characterize biological processes. However, the full potential of these gene sets has not yet been achieved. Since the advent of gene expression databases, many have posited that these databases can reveal properties of biological activities that are not evident from individual datasets, analogous to how the expression of a single gene generally cannot reveal the activation of a biological process.

Results: To address this issue, we have developed a high-throughput method to mine gene expression databases for the regulation of gene sets. Given a set of genes, we scored it against each gene expression dataset by looking for enrichment of co-regulated genes relative to an empirical null distribution. After validating the method, we applied it to address two biological problems. First, we deciphered the E2F transcriptional network. We confirmed that true transcriptional targets exhibit a distinct regulatory profile across a database. Second, we leveraged the patterns of regulation across a database of gene sets to produce an automatically generated catalog of biological processes. These demonstrations revealed the power of a global analysis of the data contained within gene expression databases, and the potential for using them to address biological questions.
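The scoring step described above (enrichment of co-regulated genes relative to an empirical null distribution) can be illustrated with a small sketch. The abstract does not specify the statistic or how the null is built, so the choices here are assumptions made for illustration only: co-regulation is measured as the mean absolute pairwise Pearson correlation within the gene set, and the null distribution comes from random gene sets of the same size drawn from the same dataset. All function names and the toy dataset are hypothetical and are not taken from the paper.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def coregulation_score(expr, genes):
    """Assumed statistic: mean |r| over all gene pairs in the set (needs >= 2 genes)."""
    pairs = list(combinations(genes, 2))
    return sum(abs(pearson(expr[a], expr[b])) for a, b in pairs) / len(pairs)


def empirical_p_value(expr, gene_set, n_random=200, seed=0):
    """Compare the observed score with scores of random same-size gene sets."""
    rng = random.Random(seed)
    observed = coregulation_score(expr, gene_set)
    all_genes = sorted(expr)
    null = [
        coregulation_score(expr, rng.sample(all_genes, len(gene_set)))
        for _ in range(n_random)
    ]
    exceed = sum(1 for s in null if s >= observed)
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (1 + exceed) / (1 + n_random)


def make_toy_dataset(n_genes=100, n_samples=20, n_coregulated=8, seed=1):
    """Synthetic data: the first n_coregulated genes share a common signal."""
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n_samples)]
    expr = {}
    for g in range(n_genes):
        noise = [rng.gauss(0, 1) for _ in range(n_samples)]
        if g < n_coregulated:
            expr[f"gene{g}"] = [3.0 * s + e for s, e in zip(shared, noise)]
        else:
            expr[f"gene{g}"] = noise
    return expr


if __name__ == "__main__":
    expr = make_toy_dataset()
    gene_set = [f"gene{g}" for g in range(8)]
    score, p = empirical_p_value(expr, gene_set)
    print(f"score={score:.3f}, empirical p={p:.4f}")
```

Repeating this call for every dataset in a database would yield one score per dataset for a given gene set, which is the kind of database-wide regulatory profile the abstract refers to; the statistic actually used by the authors may differ.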

Citing Articles

CTen: a web-based platform for identifying enriched cell types from heterogeneous microarray data.

Shoemaker J, Lopes T, Ghosh S, Matsuoka Y, Kawaoka Y, Kitano H. BMC Genomics. 2012; 13:460.

PMID: 22953731 PMC: 3473317. DOI: 10.1186/1471-2164-13-460.
