
A Hidden Markov Model Approach for Determining Expression from Genomic Tiling Micro Arrays

Overview
Publisher Biomed Central
Specialty Biology
Date 2006 May 5
PMID 16672042
Citations 16
Abstract

Background: Genomic tiling micro arrays have great potential for identifying previously undiscovered coding and non-coding transcription. To date, however, analyses of these data have been performed in an ad hoc fashion.

Results: We present a probabilistic procedure, ExpressHMM, that adaptively models tiling data prior to predicting expression on genomic sequence. A hidden Markov model (HMM) is used to model the distributions of tiling array probe scores in expressed and non-expressed regions. The HMM is trained on sets of probes mapped to regions of annotated expression and non-expression. Subsequently, transcribed fragments (transfrags) are predicted on tiled genomic sequence. Each prediction is accompanied by an expression probability curve for visual inspection of the supporting evidence. We test ExpressHMM on data from the Cheng et al. (2005) tiling array experiments on ten human chromosomes. Results can be downloaded and viewed from our web site.
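
The general idea of a two-state HMM over probe scores can be illustrated with a minimal sketch. This is not the ExpressHMM implementation: the Gaussian emission model, the fixed transition probability `stay`, the 0.5 posterior threshold, the toy scores, and all function names are assumptions made for illustration only. The sketch fits one emission distribution per state from labelled probes, computes a per-probe posterior probability of expression with the forward-backward algorithm (analogous to an expression probability curve), and groups consecutive high-probability probes into fragments.

```python
import math

def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def fit_gaussian(scores):
    """Estimate mean and standard deviation from labelled training probes."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, max(math.sqrt(var), 1e-6)

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def posterior_expressed(scores, params, stay=0.9):
    """Forward-backward in log space: P(expressed | all scores) per probe.

    State 0 = non-expressed, state 1 = expressed. `stay` is an assumed,
    fixed self-transition probability; a real model would estimate it.
    """
    states = (0, 1)
    log_trans = [[math.log(stay if i == j else 1 - stay) for j in states] for i in states]
    log_emit = [[gaussian_logpdf(x, *params[s]) for s in states] for x in scores]
    n = len(scores)
    fwd = [[0.0, 0.0] for _ in range(n)]
    bwd = [[0.0, 0.0] for _ in range(n)]  # last row stays log(1) = 0
    for s in states:
        fwd[0][s] = math.log(0.5) + log_emit[0][s]  # uniform start (assumed)
    for t in range(1, n):
        for s in states:
            fwd[t][s] = log_emit[t][s] + logsumexp(
                [fwd[t - 1][r] + log_trans[r][s] for r in states])
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = logsumexp(
                [log_trans[s][r] + log_emit[t + 1][r] + bwd[t + 1][r] for r in states])
    total = logsumexp(fwd[n - 1])
    return [math.exp(fwd[t][1] + bwd[t][1] - total) for t in range(n)]

def call_transfrags(posterior, threshold=0.5):
    """Group consecutive probes above the threshold into (start, end) index
    pairs, with end inclusive."""
    frags, start = [], None
    for i, p in enumerate(posterior):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            frags.append((start, i - 1))
            start = None
    if start is not None:
        frags.append((start, len(posterior) - 1))
    return frags

# Toy training probes (hypothetical values, not data from the paper).
train_non = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
train_exp = [2.9, 3.2, 3.0, 2.7, 3.3, 3.1]
params = [fit_gaussian(train_non), fit_gaussian(train_exp)]

# Toy probe scores along a tiled region, in genomic order.
scores = [0.1, 0.0, 3.0, 3.1, 2.8, 0.2, -0.1]
posterior = posterior_expressed(scores, params)
transfrags = call_transfrags(posterior)
print(transfrags)
```

Working in log space avoids numerical underflow on long probe sequences, which matters when a chromosome is tiled with many thousands of probes.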

Conclusion: The value of adaptive modelling of fluorescence scores prior to categorisation into expressed and non-expressed probes is demonstrated. Our results indicate that our adaptive approach is superior to the previous analysis in terms of nucleotide sensitivity and transfrag specificity.
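
The two evaluation measures named above are not defined in this abstract. The sketch below shows one common way such measures are computed, under assumed definitions: nucleotide sensitivity as the fraction of annotated expressed nucleotides covered by predicted fragments, and transfrag specificity as the fraction of predicted fragments that overlap at least one annotated region. The function names, the inclusive coordinate convention, and the example intervals are hypothetical.

```python
def nucleotide_sensitivity(annotated, predicted):
    """Fraction of annotated nucleotides covered by any predicted fragment.

    Intervals are (start, end) pairs with inclusive ends (assumed convention).
    """
    ann = set()
    for start, end in annotated:
        ann.update(range(start, end + 1))
    pred = set()
    for start, end in predicted:
        pred.update(range(start, end + 1))
    return len(ann & pred) / len(ann) if ann else 0.0

def transfrag_specificity(annotated, predicted):
    """Fraction of predicted fragments overlapping at least one annotation."""
    if not predicted:
        return 0.0
    hits = sum(
        1 for p_start, p_end in predicted
        if any(p_start <= a_end and a_start <= p_end for a_start, a_end in annotated)
    )
    return hits / len(predicted)

# Hypothetical intervals for illustration only.
annotated = [(100, 199)]
predicted = [(150, 249), (400, 449)]
sensitivity = nucleotide_sensitivity(annotated, predicted)
specificity = transfrag_specificity(annotated, predicted)
print(sensitivity, specificity)
```

Expanding intervals into sets of positions keeps the sketch short; for chromosome-scale data an interval-sweep approach would be more memory-efficient.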

Citing Articles

Absence/presence calling in microarray-based CGH experiments with non-model organisms.

Jonker M, de Leeuw W, Marinkovic M, Wittink F, Rauwerda H, Bruning O Nucleic Acids Res. 2014; 42(11):e94.

PMID: 24771343 PMC: 4066771. DOI: 10.1093/nar/gku343.


Analysis of tiling array expression studies with flexible designs in Bioconductor (waveTiling).

De Beuf K, Pipelers P, Andriankaja M, Thas O, Inze D, Crainiceanu C BMC Bioinformatics. 2012; 13:234.

PMID: 22974078 PMC: 3558343. DOI: 10.1186/1471-2105-13-234.


Bioinformatics tools in predictive ecology: applications to fisheries.

Tucker A, Duplisea D Philos Trans R Soc Lond B Biol Sci. 2011; 367(1586):279-90.

PMID: 22144390 PMC: 3223807. DOI: 10.1098/rstb.2011.0184.


Normalization of high dimensional genomics data where the distribution of the altered variables is skewed.

Landfors M, Philip P, Ryden P, Stenberg P PLoS One. 2011; 6(11):e27942.

PMID: 22132175 PMC: 3222656. DOI: 10.1371/journal.pone.0027942.


Generalizing moving averages for tiling arrays using combined p-value statistics.

Kechris K, Biehs B, Kornberg T Stat Appl Genet Mol Biol. 2010; 9:Article29.

PMID: 20812907 PMC: 2942027. DOI: 10.2202/1544-6115.1434.

