
A Bi-Poisson Model for Clustering Gene Expression Profiles by RNA-seq

Overview
Journal: Brief Bioinform
Specialty: Biology
Date: 2013 May 14
PMID: 23665510
Citations: 4
Abstract

With the availability of gene expression data by RNA-seq, powerful statistical approaches for grouping similar gene expression profiles across different environments have become increasingly important. We describe and assess a computational model for clustering genes into distinct groups based on the pattern of gene expression in response to changing environment. The model capitalizes on the Poisson distribution to capture the count property of RNA-seq data. A two-stage hierarchical expectation–maximization (EM) algorithm is implemented to estimate an optimal number of groups and mean expression amounts of each group across two environments. A procedure is formulated to test whether and how a given group shows a plastic response to environmental changes. The impact of gene–environment interactions on the phenotypic plasticity of the organism can also be visualized and characterized. The model was used to analyse an RNA-seq dataset measured from two cell lines of breast cancer that respond differently to an anti-cancer drug, from which genes associated with the resistance and sensitivity of the cell lines are identified. We performed simulation studies to validate the statistical behaviour of the model. The model provides a useful tool for clustering gene expression data by RNA-seq, facilitating our understanding of gene functions and networks.
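To make the clustering idea in the abstract concrete, here is a minimal sketch of a mixture of independent Poisson pairs fitted by EM, where each group has one mean per environment. It is an illustration under stated assumptions, not the authors' implementation: it uses a plain single-stage EM rather than the two-stage hierarchical EM the abstract mentions, it uses BIC as a stand-in criterion for comparing numbers of groups, and the names `fit_bipoisson_mixture` and `bic` are hypothetical.

```python
import math
import random


def _log_pois_pair(x, lam):
    """Log-probability of a count pair under two independent Poissons."""
    return sum(c * math.log(l) - l - math.lgamma(c + 1) for c, l in zip(x, lam))


def fit_bipoisson_mixture(counts, k, n_iter=200, tol=1e-8, seed=0):
    """Fit a k-group mixture to (env1, env2) count pairs with plain EM.

    Assumption: this is ordinary EM, not the paper's two-stage procedure.
    Returns mixing proportions, per-group means, responsibilities, and the
    log-likelihood evaluated at the last E-step.
    """
    rng = random.Random(seed)
    n = len(counts)
    # Start each group at a randomly chosen gene; +0.5 avoids log(0).
    lam = [[c + 0.5 for c in counts[i]] for i in rng.sample(range(n), k)]
    pi = [1.0 / k] * k
    prev = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each group for each gene,
        # computed in log space for numerical stability.
        resp, loglik = [], 0.0
        for x in counts:
            logp = [math.log(pi[j]) + _log_pois_pair(x, lam[j]) for j in range(k)]
            m = max(logp)
            norm = m + math.log(sum(math.exp(v - m) for v in logp))
            resp.append([math.exp(v - norm) for v in logp])
            loglik += norm
        # M-step: responsibility-weighted proportions and means.
        for j in range(k):
            w = sum(r[j] for r in resp) + 1e-12  # guard against empty groups
            pi[j] = w / n
            lam[j] = [
                sum(r[j] * x[e] for r, x in zip(resp, counts)) / w + 1e-9
                for e in range(2)
            ]
        if abs(loglik - prev) < tol * (1.0 + abs(loglik)):
            break
        prev = loglik
    return pi, lam, resp, loglik


def bic(loglik, k, n):
    """BIC with (k - 1) mixing proportions and 2 * k Poisson means."""
    n_params = (k - 1) + 2 * k
    return -2.0 * loglik + n_params * math.log(n)
```

One way to use the sketch is to call `fit_bipoisson_mixture(counts, k)` for several values of `k` and keep the fit with the lowest `bic(loglik, k, len(counts))`. Comparing the two fitted means within a group then gives a descriptive view of how that group's expression shifts between the two environments; the formal plasticity test described in the abstract is not reproduced here.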

Citing Articles

A block mixture model to map eQTLs for gene clustering and networking.

Wang N, Gosik K, Li R, Lindsay B, Wu R. Sci Rep. 2016; 6:21193.

PMID: 26892775 PMC: 4759821. DOI: 10.1038/srep21193.


DGEclust: differential expression analysis of clustered count data.

Vavoulis D, Francescatto M, Heutink P, Gough J. Genome Biol. 2015; 16:39.

PMID: 25853652 PMC: 4365804. DOI: 10.1186/s13059-015-0604-6.


Modeling Expression Plasticity of Genes that Differentiate Drug-sensitive from Drug-resistant Cells to Chemotherapeutic Treatment.

Wang N, Wang Y, Han H, Huber K, Yang J, Li R. Curr Genomics. 2014; 15(5):349-56.

PMID: 25435798 PMC: 4245695. DOI: 10.2174/138920291505141106102854.


A skellam model to identify differential patterns of gene expression induced by environmental signals.

Jiang L, Mao K, Wu R. BMC Genomics. 2014; 15:772.

PMID: 25199446 PMC: 4167515. DOI: 10.1186/1471-2164-15-772.
