
Shrinkage Regression-based Methods for Microarray Missing Value Imputation

Overview
Journal: BMC Syst Biol
Publisher: BioMed Central
Specialty: Biology
Date: 2014 Feb 26
PMID: 24565159
Citations: 3
Abstract

Background: Missing values are common in microarray data: datasets usually contain more than 5% missing values, and up to 90% of genes can be affected. Inaccurate estimation of missing values reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, regression-based methods are very popular and have been shown to outperform the other types of methods on many test microarray datasets.

Results: To further improve the performance of regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in microarray data and select genes similar to the target gene by Pearson correlation coefficient. They then fit a regression model by the least squares principle, adjust its coefficients with a shrinkage estimation approach, and use the adjusted coefficients to estimate the missing values. Simulation results show that the proposed methods estimate missing values more accurately than the existing regression-based methods on six test microarray datasets.
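The pipeline described in the Results paragraph (select similar genes by Pearson correlation, fit by least squares, shrink the coefficients, predict the missing entries) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the shrinkage estimator, so a single fixed multiplier `shrink` stands in for it here, and the function name `impute_target`, the parameter `k`, and the assumption that the candidate genes are complete are all hypothetical choices made for illustration.

```python
import numpy as np

def impute_target(target, candidates, k=10, shrink=0.9):
    """Fill NaNs in one target gene from similar, complete candidate genes.

    target:     1-D array of expression values, NaN where missing.
    candidates: 2-D array (genes x samples) with no missing values (assumption).
    k:          number of similar genes to use (hypothetical parameter).
    shrink:     fixed multiplier in [0, 1] applied to the least squares
                coefficients; a placeholder for the paper's shrinkage
                estimator, which the abstract does not specify.
    """
    target = np.asarray(target, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    miss = np.isnan(target)
    obs = ~miss

    # Pearson correlation of each candidate gene with the target,
    # computed over the samples where the target is observed.
    corr = np.array([np.corrcoef(target[obs], g[obs])[0, 1] for g in candidates])

    # Keep the k genes with the largest absolute correlation.
    top = np.argsort(-np.abs(corr))[:k]

    # Least squares fit: observed target values regressed on the selected genes.
    A = candidates[top][:, obs].T          # shape (n_observed, k)
    beta, *_ = np.linalg.lstsq(A, target[obs], rcond=None)

    # Shrink the coefficients toward zero (placeholder shrinkage step).
    beta_shrunk = shrink * beta

    # Predict the missing entries from the selected genes.
    filled = target.copy()
    filled[miss] = candidates[top][:, miss].T @ beta_shrunk
    return filled
```

With `shrink=1.0` the sketch reduces to plain least squares regression imputation; values below 1 pull the predictions toward zero, which is only meaningful if the expression values are centered beforehand.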

Conclusions: Imputation of missing values is a very important aspect of microarray data analysis because most downstream analyses require a complete dataset. Exploring accurate and efficient methods for estimating missing values has therefore become an essential issue. Because our proposed shrinkage regression-based methods provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.

Citing Articles

Use of meat juice and blood serum with a miniaturised protein microarray assay to develop a multi-parameter IgG screening test with high sample throughput potential for slaughtering pigs.

Loreck K, Mitrenga S, Heinze R, Ehricht R, Engemann C, Lueken C. BMC Vet Res. 2020; 16(1):106.

PMID: 32252773 PMC: 7137480. DOI: 10.1186/s12917-020-02308-4.


Development of a miniaturized protein microarray as a new serological IgG screening test for zoonotic agents and production diseases in pigs.

Loreck K, Mitrenga S, Meemken D, Heinze R, Reissig A, Mueller E. PLoS One. 2019; 14(5):e0217290.

PMID: 31116794 PMC: 6530865. DOI: 10.1371/journal.pone.0217290.


MVIAeval: a web tool for comprehensively evaluating the performance of a new missing value imputation algorithm.

Wu W, Jhou M. BMC Bioinformatics. 2017; 18(1):31.

PMID: 28086746 PMC: 5237319. DOI: 10.1186/s12859-016-1429-3.
