Empirical Comparison of Cross-platform Normalization Methods for Gene Expression Data

Overview
Publisher BioMed Central
Specialty Biology
Date 2011 Dec 14
PMID 22151536
Citations 50
Abstract

Background: Simultaneous measurement of gene expression on a genomic scale can be accomplished using microarray technology or sequencing-based methods. Researchers who perform high-throughput gene expression assays often deposit their data in public databases, but the heterogeneity of measurement platforms makes it difficult to combine and compare data sets. Researchers wishing to perform cross-platform normalization face two major obstacles. First, a choice must be made about which method or methods to employ. Nine methods are currently available, and no rigorous comparison of them exists. Second, software for the selected method must be obtained and incorporated into a data analysis workflow.

Results: Using two publicly available cross-platform testing data sets, cross-platform normalization methods are compared based on inter-platform concordance and on the consistency of gene lists obtained with transformed data. Scatter and ROC-like plots are produced and new statistics based on those plots are introduced to measure the effectiveness of each method. Bootstrapping is employed to obtain distributions for those statistics. The consistency of platform effects across studies is explored theoretically and with respect to the testing data sets.
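The bootstrapping step can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not define its new statistics, so Pearson correlation stands in here as a generic inter-platform concordance measure, and resampling over genes is an assumption. The function names (`pearson`, `bootstrap_concordance`, `percentile_interval`) and the toy data are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_concordance(platform_a, platform_b, n_boot=1000, seed=0):
    """Resample genes with replacement (an assumption, see lead-in) and
    return the sorted bootstrap distribution of the correlation."""
    rng = random.Random(seed)
    n = len(platform_a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([platform_a[i] for i in idx],
                             [platform_b[i] for i in idx]))
    return sorted(stats)


def percentile_interval(sorted_stats, level=0.95):
    """Percentile interval read off a sorted bootstrap distribution."""
    lo = int((1 - level) / 2 * len(sorted_stats))
    hi = int((1 + level) / 2 * len(sorted_stats)) - 1
    return sorted_stats[lo], sorted_stats[hi]


# Toy data (simulated, not from the paper): one shared expression signal
# measured on two platforms, each with independent noise.
toy_rng = random.Random(1)
truth = [toy_rng.gauss(8, 2) for _ in range(200)]
platform_a = [t + toy_rng.gauss(0, 0.5) for t in truth]
platform_b = [t + toy_rng.gauss(0, 0.5) for t in truth]

observed = pearson(platform_a, platform_b)
boot = bootstrap_concordance(platform_a, platform_b, n_boot=500)
low, high = percentile_interval(boot)
print(f"concordance = {observed:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

Running the same resampling on data transformed by each normalization method would give one distribution per method, which is what allows methods to be compared on more than a single point estimate.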

Conclusions: Our comparisons indicate that four methods, DWD, EB, GQ, and XPN, are generally effective, while the remaining methods do not adequately correct for platform effects. Of the four successful methods, XPN generally shows the highest inter-platform concordance when treatment groups are equally sized, while DWD is most robust to differently sized treatment groups and consistently shows the smallest loss in gene detection. We provide an R package, CONOR, capable of performing the nine cross-platform normalization methods considered. The package can be downloaded at http://alborz.sdsu.edu/conor and is available from CRAN.

Citing Articles

Comparison and development of cross-study normalization methods for inter-species transcriptional analysis.

Feldman S, Ner-Gaon H, Treister E, Shay T PLoS One. 2024; 19(9):e0307997.

PMID: 39255285 PMC: 11386461. DOI: 10.1371/journal.pone.0307997.


Uniformly shaped harmonization combines human transcriptomic data from different platforms while retaining their biological properties and differential gene expression patterns.

Borisov N, Tkachev V, Simonov A, Sorokin M, Kim E, Kuzmin D Front Mol Biosci. 2023; 10:1237129.

PMID: 37745690 PMC: 10511763. DOI: 10.3389/fmolb.2023.1237129.


Assessing equivalent and inverse change in genes between diverse experiments.

Neums L, Koestler D, Xia Q, Hu J, Patel S, Bell-Glenn S Front Bioinform. 2022; 2:893032.

PMID: 36304274 PMC: 9580844. DOI: 10.3389/fbinf.2022.893032.


Transcriptomic Harmonization as the Way for Suppressing Cross-Platform Bias and Batch Effect.

Borisov N, Buzdin A Biomedicines. 2022; 10(9).

PMID: 36140419 PMC: 9496268. DOI: 10.3390/biomedicines10092318.


Microarray Data Preprocessing: From Experimental Design to Differential Analysis.

Federico A, Saarimaki L, Serra A, Del Giudice G, Kinaret P, Scala G Methods Mol Biol. 2021; 2401:79-100.

PMID: 34902124 DOI: 10.1007/978-1-0716-1839-4_7.

