Applications of Bayesian Gene Selection and Classification with Mixtures of Generalized Singular G-priors

Overview
Publisher Hindawi
Date 2014 Jan 3
PMID 24382981
Abstract

Recent advances in microarray technology have made it possible to collect an enormous number of genetic markers in disease association studies, yet scientists are typically interested in selecting a smaller set of genes to explore the relation between genes and disease. Current approaches either adopt a single-marker test, which ignores possible interactions among genes, or use a multistage procedure that reduces the large number of genes before evaluating the association. Among the latter, Bayesian analysis can further accommodate the correlation between genes through the specification of a multivariate prior distribution and can estimate the probabilities of association through latent variables. The covariance matrix of this prior, however, depends on an unknown parameter. In this research, we propose a reference hyperprior distribution to account for this uncertainty, outline its computational implementation, and illustrate the fully Bayesian approach with colon cancer and leukemia studies. We also compare it with other existing methods. The proposed model achieves higher classification accuracy while selecting a smaller set of genes. The results not only replicate findings from several earlier studies but also quantify the strength of association with posterior probabilities.
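
For orientation, the kind of hierarchy the abstract describes can be sketched as follows. This is only an illustrative outline under assumed notation, not the paper's own specification: the probit link, the inclusion indicators $\gamma_j$, the scale parameter $c$, and the use of the Moore-Penrose inverse $(\cdot)^{+}$ are assumptions based on the conventional form of a generalized singular g-prior.

```latex
% Assumed notation (not taken from the abstract):
%   y_i \in \{0,1\}      disease status of sample i
%   x_i                  expression profile of sample i, X the n x p design matrix
%   \gamma_j \in \{0,1\} latent indicator that gene j is in the model
%   X_\gamma, \beta_\gamma  columns / coefficients of the selected genes
%   c                    unknown scale of the prior covariance
\begin{align*}
  P(y_i = 1 \mid \beta_\gamma, \gamma) &= \Phi\!\left(x_{i,\gamma}^{\top}\beta_\gamma\right), \\
  \beta_\gamma \mid \gamma, c &\sim N\!\left(0,\; c\,\bigl(X_\gamma^{\top}X_\gamma\bigr)^{+}\right), \\
  c &\sim \pi(c) \quad \text{(hyperprior on the unknown scale)}, \\
  \text{strength of association for gene } j &:\quad P(\gamma_j = 1 \mid y, X).
\end{align*}
```

Under these assumptions, the generalized inverse keeps the prior well defined when the number of selected genes exceeds the number of samples, and the posterior probability $P(\gamma_j = 1 \mid y, X)$ plays the role of the reported strength of association.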
