Discovering Transcriptional Modules by Bayesian Data Integration

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2010 Jun 10
PMID: 20529901
Citations: 38
Abstract

Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets.

Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.

Availability: The code for the work presented in this article is available from the authors on request.

Supplementary Information: Supplementary data are available at Bioinformatics online.
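The gene-by-gene fusion idea described under Motivation can be illustrated with a small generative toy. This is only a sketch under stated assumptions, not the authors' hierarchical Dirichlet process model: it draws cluster labels from a Chinese restaurant process (the partition prior that underlies Dirichlet process mixtures), then lets each gene be either "fused" (same cluster in both datasets) or "unfused" (clustered independently in the second dataset). The function names `crp_partition` and `toy_fused_labels`, and the parameters `alpha` and `p_fused`, are hypothetical and chosen for illustration.

```python
import random


def crp_partition(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels, counts = [], []
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha (alpha > 0).
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def toy_fused_labels(n_genes, alpha=1.0, p_fused=0.5, seed=0):
    """Toy illustration (an assumption, not the published model).

    Returns expression labels, ChIP-chip labels, and per-gene fusion flags.
    Fused genes share one label across both datasets; unfused genes get an
    independent label in the ChIP-chip dataset.
    """
    rng = random.Random(seed)
    expr = crp_partition(n_genes, alpha, rng)
    chip_free = crp_partition(n_genes, alpha, rng)
    fused = [rng.random() < p_fused for _ in range(n_genes)]
    # Offset the independent labels so they never collide with shared ones.
    offset = max(expr) + 1
    chip = [e if f else offset + c for e, c, f in zip(expr, chip_free, fused)]
    return expr, chip, fused


if __name__ == "__main__":
    expr, chip, fused = toy_fused_labels(10, seed=1)
    for gene, (e, c, f) in enumerate(zip(expr, chip, fused)):
        print(f"gene {gene}: expression={e} chip={c} fused={f}")
```

In this toy, the fused genes are the subset whose grouping is identical in both datasets, which mirrors the abstract's point that co-expression and co-regulation need not coincide for every gene. Inferring the fusion flags from real data is the hard part and is not attempted here.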

Citing Articles

Bayesian profile regression for clustering analysis involving a longitudinal response and explanatory variables.

Rouanet A, Johnson R, Strauss M, Richardson S, Tom B, White S. Methodology (Gott). 2024; 73(2):314-339.

PMID: 38577633 PMC: 7615733. DOI: 10.1093/jrsssc/qlad097.


A Drug Repurposing Pipeline Based on Bladder Cancer Integrated Proteotranscriptomics Signatures.

Mokou M, Narayanasamy S, Stroggilos R, Balaur I, Vlahou A, Mischak H. Methods Mol Biol. 2023; 2684:59-99.

PMID: 37410228 DOI: 10.1007/978-1-0716-3291-8_4.


Statistical analysis of high-dimensional biomedical data: a gentle introduction to analytical goals, common approaches and challenges.

Rahnenfuhrer J, De Bin R, Benner A, Ambrogi F, Lusa L, Boulesteix A. BMC Med. 2023; 21(1):182.

PMID: 37189125 PMC: 10186672. DOI: 10.1186/s12916-023-02858-y.


Telescoping bimodal latent Dirichlet allocation to identify expression QTLs across tissues.

Gewirtz A, Townes F, Engelhardt B. Life Sci Alliance. 2022; 5(12).

PMID: 35977827 PMC: 9387650. DOI: 10.26508/lsa.202101297.


Unsupervised Multi-Omics Data Integration Methods: A Comprehensive Review.

Vahabi N, Michailidis G. Front Genet. 2022; 13:854752.

PMID: 35391796 PMC: 8981526. DOI: 10.3389/fgene.2022.854752.
