
Bayesian Modeling of MPSS Data: Gene Expression Analysis of Bovine Salmonella Infection

Overview
Journal J Am Stat Assoc
Specialty Public Health
Date 2010 Dec 18
PMID 21165171
Citations 2
Abstract

Massively Parallel Signature Sequencing (MPSS) is a high-throughput, counting-based technology for gene expression profiling. Its output is similar to that of Serial Analysis of Gene Expression (SAGE), and it is well suited to building complex relational databases for gene expression. Our goal is to compare the in vivo global gene expression profiles of tissues infected with different strains of Salmonella, obtained using the MPSS technology. In this article, we develop an exact ANOVA-type model for these count data using a zero-inflated Poisson (ZIP) distribution, in contrast to existing methods, which assume continuous densities. We adopt two Bayesian hierarchical models: one parametric, and the other semiparametric with a Dirichlet process prior that has the ability to "borrow strength" across related signatures, where a signature is a specific arrangement of nucleotides, usually 16 to 21 base pairs long. We use the discreteness of the Dirichlet process prior to cluster signatures that exhibit similar differential expression profiles. Tests for differential expression are carried out using non-parametric approaches while controlling the false discovery rate. We identify several differentially expressed genes of biological significance and conclude with a summary of the biological discoveries.
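
The zero-inflated Poisson distribution named in the abstract can be illustrated with a minimal sketch. It assumes the standard ZIP form: a count is a structural zero with probability `pi`, and otherwise follows a Poisson distribution with rate `lam`. The names `zip_pmf`, `zip_mean`, `pi`, and `lam` are illustrative and do not come from the article; the article's ANOVA-type structure on the rate and its Bayesian priors are not reproduced here.

```python
import math


def zip_pmf(y: int, pi: float, lam: float) -> float:
    """P(Y = y) for a zero-inflated Poisson (illustrative parameterization).

    pi  : probability of a structural (excess) zero, 0 <= pi <= 1
    lam : rate of the Poisson component, lam > 0
    """
    if y < 0:
        return 0.0
    # Poisson pmf computed on the log scale for numerical stability.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    # Zeros arise from two sources: the point mass and the Poisson component.
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * poisson


def zip_mean(pi: float, lam: float) -> float:
    """Mean of the ZIP distribution: only the Poisson component contributes."""
    return (1.0 - pi) * lam


if __name__ == "__main__":
    # Probability of observing a zero count versus a count of three.
    print(zip_pmf(0, pi=0.3, lam=2.0))
    print(zip_pmf(3, pi=0.3, lam=2.0))
```

Compared with a plain Poisson model, the extra mass at zero lets the model accommodate signatures that are absent from a library, without forcing the rate of the expressed signatures downward.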

Citing Articles

Feature selection of gene expression data for Cancer classification using double RBF-kernels.

Liu S, Xu C, Zhang Y, Liu J, Yu B, Liu X. BMC Bioinformatics. 2018; 19(1):396.

PMID: 30373514 PMC: 6206917. DOI: 10.1186/s12859-018-2400-2.


A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data.

Guindani M, Sepulveda N, Paulino C, Muller P. J R Stat Soc Ser C Appl Stat. 2014; 63(3):385-404.

PMID: 24833809 PMC: 4017673. DOI: 10.1111/rssc.12041.
