
Adjustment of Systematic Microarray Data Biases

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2003 Dec 25
PMID: 14693816
Citations: 205
Abstract

Motivation: Systematic differences due to experimental features of microarray experiments are present in most large microarray data sets. Many different experimental features can cause biases, including different sources of RNA, different production lots of microarrays, or different microarray platforms. These systematic effects present a substantial hurdle to the analysis of microarray data.

Results: We present here a new method for the identification and adjustment of systematic biases that are present within microarray data sets. Our approach is based on modern statistical discrimination methods and is shown to be very effective in removing systematic biases present in a previously published breast tumor cDNA microarray data set. The new method of 'Distance Weighted Discrimination (DWD)' is shown to be better than Support Vector Machines and Singular Value Decomposition for the adjustment of systematic microarray effects. In addition, it is shown to be of general use as a tool for the discrimination of systematic problems present in microarray data sets, including the merging of two breast tumor data sets completed on different microarray platforms.

Availability: Matlab software to perform DWD can be retrieved from https://genome.unc.edu/pubsup/dwd/
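As a rough illustration of the kind of adjustment the Results paragraph describes, the sketch below shifts two batches along a single separating direction so that their mean projections onto that direction coincide. This is not the authors' Matlab software. The separating direction here is a stand-in (the normalized difference of batch means), whereas DWD would obtain its direction by solving an optimization problem. All function names are hypothetical, and the assumption that the adjustment is a per-batch shift along the separating direction is ours, not something the abstract states.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a list of equal-length sample vectors."""
    return [fmean(col) for col in zip(*rows)]


def unit_direction(batch_a, batch_b):
    """Stand-in separating direction: normalized difference of batch means.

    Assumption: DWD would replace this step with its own optimized direction.
    """
    mean_a, mean_b = mean_vector(batch_a), mean_vector(batch_b)
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0:
        raise ValueError("batch means coincide; no direction to adjust along")
    return [d / norm for d in diff]


def project(sample, direction):
    """Scalar projection of one sample onto a unit direction."""
    return sum(s * d for s, d in zip(sample, direction))


def adjust_batch(batch, direction):
    """Shift a batch along `direction` so its mean projection becomes zero."""
    shift = fmean(project(s, direction) for s in batch)
    return [[x - shift * d for x, d in zip(s, direction)] for s in batch]


def remove_batch_shift(batch_a, batch_b):
    """Adjust both batches along the same separating direction."""
    direction = unit_direction(batch_a, batch_b)
    return adjust_batch(batch_a, direction), adjust_batch(batch_b, direction)
```

Each sample is a list of expression values (one per gene), and each batch is a list of samples. Only the component along the separating direction is changed, so variation orthogonal to it is left untouched.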

Citing Articles

Accurate identification of medulloblastoma subtypes from diverse data sources with severe batch effects by RaMBat.

Sun M, Wang J, Wan S bioRxiv. 2025.

PMID: 40060540 PMC: 11888263. DOI: 10.1101/2025.02.24.640010.


Comparison and development of cross-study normalization methods for inter-species transcriptional analysis.

Feldman S, Ner-Gaon H, Treister E, Shay T PLoS One. 2024; 19(9):e0307997.

PMID: 39255285 PMC: 11386461. DOI: 10.1371/journal.pone.0307997.


A comprehensive overview of microbiome data in the light of machine learning applications: categorization, accessibility, and future directions.

Kumar B, Lorusso E, Fosso B, Pesole G Front Microbiol. 2024; 15:1343572.

PMID: 38419630 PMC: 10900530. DOI: 10.3389/fmicb.2024.1343572.


Batch-effect correction with sample remeasurement in highly confounded case-control studies.

Ye H, Zhang X, Wang C, Goode E, Chen J Nat Comput Sci. 2024; 3(8):709-719.

PMID: 38177326 PMC: 10993308. DOI: 10.1038/s43588-023-00500-8.


pyComBat, a Python tool for batch effects correction in high-throughput molecular data using empirical Bayes methods.

Behdenna A, Colange M, Haziza J, Gema A, Appe G, Azencott C BMC Bioinformatics. 2023; 24(1):459.

PMID: 38057718 PMC: 10701943. DOI: 10.1186/s12859-023-05578-5.