
affyPara: a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data

Overview
Publisher Sage Publications
Specialty Biology
Date 2010 Feb 9
PMID 20140068
Citations 4
Abstract

Microarray data repositories and large clinical applications of gene expression make it possible to analyse several hundred microarrays at one time. Preprocessing such large numbers of microarrays is still a challenge, because the algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets is very time consuming. Here, preprocessing has to be part of the cross-validation and resampling strategy, which is necessary to estimate the rule's prediction quality honestly.

This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. The data can be partitioned by array, and parallelization of the algorithms follows as a straightforward consequence. Partitioning the data and distributing it to several nodes solves the main memory problems and accelerates preprocessing by up to a factor of 20 for 200 or more arrays.

affyPara is a free and open source package, released under the GPL license and available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.
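The idea of partitioning the data by array and distributing the partitions to several nodes can be illustrated with a small sketch. The Python code below is not affyPara's R implementation or API. It is a minimal illustration under stated assumptions: quantile normalization is chosen here as an example preprocessing step, worker threads stand in for cluster nodes, each array is a plain list of intensities of equal length, and all function names (partition, sorted_sums, apply_reference, parallel_quantile_normalize) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(arrays, n_parts):
    """Split a list of arrays into at most n_parts contiguous, near-equal chunks."""
    size, extra = divmod(len(arrays), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are more parts than arrays
            chunks.append(arrays[start:end])
        start = end
    return chunks


def sorted_sums(chunk):
    """Worker step 1: sort each array, then sum the sorted values rank by rank."""
    return [sum(rank) for rank in zip(*(sorted(a) for a in chunk))]


def apply_reference(chunk, reference):
    """Worker step 2: replace each intensity by the reference value of its rank.

    Ties are broken by position (a simplification; tied values are not averaged).
    """
    out = []
    for a in chunk:
        order = sorted(range(len(a)), key=a.__getitem__)
        normalized = [0.0] * len(a)
        for rank, idx in enumerate(order):
            normalized[idx] = reference[rank]
        out.append(normalized)
    return out


def parallel_quantile_normalize(arrays, n_workers=2):
    """Quantile-normalize arrays, with each worker holding only its own chunk."""
    chunks = partition(arrays, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker reduces its chunk to one vector of per-rank sums.
        partial = list(pool.map(sorted_sums, chunks))
        # The coordinator combines the small partial results into rank means.
        reference = [sum(col) / len(arrays) for col in zip(*partial)]
        # The reference is sent back and applied chunk by chunk.
        results = pool.map(lambda c: apply_reference(c, reference), chunks)
        return [a for chunk in results for a in chunk]


if __name__ == "__main__":
    data = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [3.0, 4.0, 8.0]]
    for row in parallel_quantile_normalize(data, n_workers=2):
        print(row)
```

In this sketch only one summary vector per partition travels back to the coordinator, so no single worker ever has to hold all arrays at once. That is the property that would ease main memory pressure when threads are replaced by separate processes or machines.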

Citing Articles

From Genes to Metabolites: HSP90B1's Role in Alzheimer's Disease and Potential for Therapeutic Intervention.

Huang C, Liu Y, Wang S, Xia J, Hu D, Xu R. Neuromolecular Med. 2025; 27(1):6.

PMID: 39760808. DOI: 10.1007/s12017-024-08822-0.

A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases.

Lahti L, Torrente A, Elo L, Brazma A, Rung J. Nucleic Acids Res. 2013; 41(10):e110.

PMID: 23563154. PMC: 3664815. DOI: 10.1093/nar/gkt229.

Conceptual aspects of large meta-analyses with publicly available microarray data: a case study in oncology.

Schmidberger M, Lennert S, Mansmann U. Bioinform Biol Insights. 2011; 5:13-39.

PMID: 21423405. PMC: 3045047. DOI: 10.4137/BBI.S5537.

AnyExpress: integrated toolkit for analysis of cross-platform gene expression data using a fast interval matching algorithm.

Kim J, Patel K, Jung H, Kuo W, Ohno-Machado L. BMC Bioinformatics. 2011; 12:75.

PMID: 21410990. PMC: 3076267. DOI: 10.1186/1471-2105-12-75.
