Removing Batch Effects in Analysis of Expression Microarray Data: an Evaluation of Six Batch Adjustment Methods

Overview
Journal PLoS One
Date 2011 Mar 10
PMID 21386892
Citations 278
Abstract

The expression microarray is a frequently used approach to study gene expression on a genome-wide scale. However, the data produced by the thousands of microarray studies published annually are confounded by "batch effects," the systematic error introduced when samples are processed in multiple batches. Although batch effects can be reduced by careful experimental design, they cannot be eliminated unless the whole study is done in a single batch. A number of programs are now available to adjust microarray data for batch effects prior to analysis. We systematically evaluated six of these programs using multiple measures of precision, accuracy and overall performance. ComBat, an Empirical Bayes method, outperformed the other five programs by most metrics. We also showed that it is essential to standardize expression data at the probe level when testing for correlation of expression profiles, due to a sizeable probe effect in microarray data that can inflate the correlation among replicates and unrelated samples.
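The probe-effect point in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis. It assumes a toy model in which every probe has its own baseline intensity and each sample adds independent noise, so the samples are unrelated by construction. All names and parameter values (`probe_baseline`, `standardize_probes`, 2000 probes, 20 samples, the noise scales) are hypothetical choices made for illustration. It compares the correlation between two samples before and after each probe is standardized (z-scored) across samples.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def standardize_probes(matrix):
    """Z-score each probe (row) across samples (columns)."""
    out = []
    for row in matrix:
        mean = statistics.fmean(row)
        sd = statistics.stdev(row)
        out.append([(v - mean) / sd for v in row])
    return out


def column(matrix, j):
    """Return sample j as a list of values across all probes."""
    return [row[j] for row in matrix]


rng = random.Random(0)
n_probes, n_samples = 2000, 20  # assumed sizes, for illustration only

# Assumed toy model: a probe-specific baseline shared by every sample,
# plus independent per-sample noise (no real biological relationship).
probe_baseline = [rng.gauss(8.0, 2.0) for _ in range(n_probes)]
raw = [[b + rng.gauss(0.0, 0.5) for _ in range(n_samples)]
       for b in probe_baseline]

standardized = standardize_probes(raw)

raw_r = pearson(column(raw, 0), column(raw, 1))
std_r = pearson(column(standardized, 0), column(standardized, 1))

print(f"correlation on raw values:          {raw_r:.3f}")
print(f"correlation after probe z-scoring:  {std_r:.3f}")
```

Under these assumptions the raw correlation is high simply because both samples share the same probe baselines, while the standardized correlation falls close to zero. This is the kind of inflation the abstract warns about.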

Citing Articles

The BAMBOO method for correcting batch effects in high throughput proximity extension assays for proteomic studies.

Smits H, Delemarre E, Pandit A, Schoneveld A, Oldenburg B, van Wijk F. Sci Rep. 2025; 15(1):1498.

PMID: 39789032. PMC: 11717925. DOI: 10.1038/s41598-024-84320-4.


ssMutPA: single-sample mutation-based pathway analysis approach for cancer precision medicine.

He Y, Lai J, Wang Q, Pan B, Li S, Zhao X. Gigascience. 2024; 13.

PMID: 39704703. PMC: 11659979. DOI: 10.1093/gigascience/giae105.


Assessment of ComBat Harmonization Performance on Structural Magnetic Resonance Imaging Measurements.

Tassi E, Bianchi A, Calesella F, Vai B, Bellani M, Nenadic I. Hum Brain Mapp. 2024; 45(18):e70085.

PMID: 39704541. PMC: 11660414. DOI: 10.1002/hbm.70085.


Mitigating Interobserver Variability in Radiomics with ComBat: A Feasibility Study.

D'Anna A, Stella G, Gueli A, Marino C, Pulvirenti A. J Imaging. 2024; 10(11).

PMID: 39590734. PMC: 11595722. DOI: 10.3390/jimaging10110270.


Differential Radiomics-Based Signature Predicts Lung Cancer Risk Accounting for Imaging Parameters in NLST Cohort.

Ebrahimpour L, Despres P, Manem V. Cancer Med. 2024; 13(20):e70359.

PMID: 39463128. PMC: 11513548. DOI: 10.1002/cam4.70359.

