Characteristic Gene Selection Via Weighting Principal Components by Singular Values

Overview
Journal PLoS One
Date 2012 Jul 19
PMID 22808018
Citations 4
Abstract

Conventional gene selection methods based on principal component analysis (PCA) use only the first principal component (PC) of PCA or sparse PCA to select characteristic genes. These methods implicitly assume that the first PC plays a dominant role in gene selection. In a number of cases, however, this assumption does not hold, and the conventional PCA-based methods then give poor selection results. To improve the performance of PCA-based gene selection, we propose a gene selection method that weights PCs by their singular values (WPCS). Because different PCs differ in importance, the singular values are used as weights that represent the influence of each PC on gene selection. ROC curves and AUC statistics on artificial data show that our method outperforms the state-of-the-art methods. Moreover, experimental results on real gene expression data sets show that our method extracts more characteristic genes responding to abiotic stresses than conventional gene selection methods.
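The idea of weighting PCs by singular values can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so the score used here (the sum over PCs of singular value times absolute gene loading) is an assumption, as are the function names `wpcs_scores` and `select_genes`, the genes-by-samples layout of `X`, and the synthetic data. It should be read as one plausible interpretation, not as the authors' implementation.

```python
import numpy as np

def wpcs_scores(X, n_components=None):
    """Score each gene (row of X) by singular-value-weighted PC loadings.

    Assumed scoring rule (not taken from the paper):
        score_i = sum_k sigma_k * |U[i, k]|
    """
    # Center each gene across samples before the decomposition.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Thin SVD: columns of U hold the gene loadings of each PC,
    # s holds the singular values in descending order.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    k = len(s) if n_components is None else min(n_components, len(s))
    # Weight the absolute loadings of the first k PCs by their singular values.
    return np.abs(U[:, :k]) @ s[:k]

def select_genes(X, n_genes, n_components=None):
    """Return indices of the n_genes highest-scoring genes."""
    scores = wpcs_scores(X, n_components)
    return np.argsort(scores)[::-1][:n_genes]

# Synthetic example (hypothetical data): 200 genes, 12 samples.
rng = np.random.default_rng(0)
X = rng.normal(scale=0.1, size=(200, 12))
# Two orthogonal sample patterns, each driving a different gene group.
pattern_a = np.array([1.0] * 6 + [-1.0] * 6)
pattern_b = np.array([1.0, -1.0] * 6)
X[:5] += 3.0 * pattern_a    # genes 0-4 respond strongly to pattern A
X[5:10] += 2.0 * pattern_b  # genes 5-9 respond more weakly to pattern B

selected = select_genes(X, n_genes=10, n_components=2)
print(sorted(selected.tolist()))
```

In this toy setup the second gene group is carried almost entirely by the second PC, so a first-PC-only ranking would be expected to score it no higher than noise, whereas the weighted sum lets both groups contribute in proportion to their singular values.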

Citing Articles

A P-Norm Robust Feature Extraction Method for Identifying Differentially Expressed Genes.

Liu J, Liu J, Gao Y, Kong X, Wang X, Wang D PLoS One. 2015; 10(7):e0133124.

PMID: 26201006 PMC: 4511795. DOI: 10.1371/journal.pone.0133124.


A class-information-based penalized matrix decomposition for identifying plants core genes responding to abiotic stresses.

Liu J, Liu J, Gao Y, Mi J, Ma C, Wang D PLoS One. 2014; 9(9):e106097.

PMID: 25180509 PMC: 4152128. DOI: 10.1371/journal.pone.0106097.


Applications of Bayesian gene selection and classification with mixtures of generalized singular g-priors.

Chien W, Hsiao C Comput Math Methods Med. 2014; 2013:420412.

PMID: 24382981 PMC: 3870637. DOI: 10.1155/2013/420412.


Robust PCA based method for discovering differentially expressed genes.

Liu J, Wang Y, Zheng C, Sha W, Mi J, Xu Y BMC Bioinformatics. 2013; 14 Suppl 8:S3.

PMID: 23815087 PMC: 3654929. DOI: 10.1186/1471-2105-14-S8-S3.
