
Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data

Overview
Journal Biostatistics
Specialty Public Health
Date 2003 Aug 20
PMID 12925520
Citations 5806
Abstract

In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of five MGU74A mouse GeneChip arrays; part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth's Genetics Institute involving 95 HG-U95A human GeneChip arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip arrays. We display some familiar features of the perfect match and mismatch probe (PM and MM) values of these data, and examine the variance-mean relationship with probe-level data from probes believed to be defective, and so delivering noise only. We explain why we need to normalize the arrays to one another using probe-level intensities. We then examine the behavior of the PM and MM using spike-in data and assess three commonly used summary measures: Affymetrix's (i) average difference (AvDiff) and (ii) MAS 5.0 signal, and (iii) the Li and Wong multiplicative model-based expression index (MBEI). The exploratory data analyses of the probe-level data motivate a new summary measure that is a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values. We evaluate the four expression summary measures using the dilution study data, assessing their behavior in terms of bias, variance and (for MBEI and RMA) model fit. Finally, we evaluate the algorithms in terms of their ability to detect known levels of differential expression using the spike-in data. We conclude that there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities.
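The abstract describes RMA only as a robust multi-array average of background-adjusted, normalized, and log-transformed PM values, and it does not name the individual algorithms. The sketch below is one plausible reading, under stated assumptions: normalization is done by quantile normalization across arrays, and the robust average removes probe-specific effects with Tukey's median polish. Background adjustment is assumed to have been applied already and is not implemented. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution.

    x: probes x arrays matrix of background-adjusted PM intensities.
    Assumption: quantile normalization stands in for the normalization
    step; the abstract does not specify the method.
    """
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # rank of each value within its column
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)   # average of the k-th smallest values
    return mean_quantiles[ranks]

def median_polish(y, n_iter=10, tol=1e-6):
    """Fit y[i, j] ~ overall + row[i] + col[j] by alternately sweeping out medians."""
    resid = np.asarray(y, dtype=float).copy()
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        # Sweep row medians out of the residuals, then recenter column effects.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        delta = np.median(col)
        col -= delta
        overall += delta
        # Sweep column medians out of the residuals, then recenter row effects.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        delta = np.median(row)
        row -= delta
        overall += delta
        if max(np.abs(row_med).max(), np.abs(col_med).max()) < tol:
            break
    return overall, row, col, resid

def rma_like_summary(log_pm_probeset):
    """Summarize one probe set (probes x arrays, log2 scale) to one value per array.

    Rows play the role of probe-specific affinities; columns carry the
    per-array expression estimate.
    """
    overall, _probe_effects, array_effects, _resid = median_polish(log_pm_probeset)
    return overall + array_effects
```

In use, one would quantile-normalize the full PM matrix, take log2, and then call `rma_like_summary` on the rows belonging to each probe set. Because medians replace means, a single aberrant probe has limited influence on the per-array summary.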
