
Merged Consensus Clustering to Assess and Improve Class Discovery with Microarray Data

Overview
Publisher Biomed Central
Specialty Biology
Date 2010 Dec 7
PMID 21129181
Citations 18
Abstract

Background: One of the most commonly performed tasks when analysing high-throughput gene expression data is to use clustering methods to classify the data into groups. Many clustering methods are available, but it is often unclear which method is best suited to the data and how to quantify the quality of the classifications produced.

Results: Here we describe an R package containing methods to analyse the consistency of clustering results from any number of different clustering methods using resampling statistics. These methods allow the identification of the best-supported clusters and additionally rank cluster members by their fidelity within the cluster. These metrics allow us to compare the performance of different clustering algorithms under different experimental conditions and to select those that produce the most reliable clustering structures. We show the application of this method to simulated data, canonical gene expression experiments and our own novel analysis of genes involved in the specification of the peripheral nervous system in the fruit fly, Drosophila melanogaster.
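The resampling idea described in the Results can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the clusterCons API. All function names, the toy 1-D k-means, the 80% subsampling proportion, and the robustness formulas (mean pairwise consensus within a cluster, and per-member mean consensus with the rest of the cluster) are assumptions chosen for illustration, following the common formulation of consensus clustering.

```python
import random
from itertools import combinations

def kmeans_1d(values, k, iters=20):
    """Toy 1-D k-means, a stand-in for any clustering method; returns one label per value."""
    srt = sorted(values)
    # Deterministic initial centres spread evenly through the sorted values.
    centres = [srt[(2 * c + 1) * len(srt) // (2 * k)] for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels

def consensus_matrix(data, cluster_fn, k, reps=50, prop=0.8, seed=0):
    """Fraction of resamples in which items i and j co-cluster, among those containing both."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(prop * n))
    for _ in range(reps):
        idx = rng.sample(range(n), size)          # subsample without replacement
        labels = cluster_fn([data[i] for i in idx], k)
        for (a, i), (b, j) in combinations(enumerate(idx), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    # Entries never sampled together (including the diagonal) are left at 0.0.
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def cluster_robustness(m, members):
    """Mean consensus over all pairs of items in a proposed cluster."""
    pairs = list(combinations(members, 2))
    return sum(m[i][j] for i, j in pairs) / len(pairs)

def member_robustness(m, members):
    """For each member, mean consensus with the other members of the cluster."""
    return {i: sum(m[i][j] for j in members if j != i) / (len(members) - 1)
            for i in members}

# Hypothetical data: a low group, a high group, and one ambiguous value (3.0).
data = [0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.1, 3.0]
m = consensus_matrix(data, kmeans_1d, k=2)
core = cluster_robustness(m, [0, 1, 2, 3])
fidelity = member_robustness(m, [0, 1, 2, 3, 7])
ranked = sorted(fidelity, key=fidelity.get, reverse=True)
```

In this toy data set the four low values always cluster together, whereas the midpoint value 3.0 co-clusters with them only in some resamples, so it should come last in `ranked`. Any clustering function with the same signature as `kmeans_1d` could be passed to `consensus_matrix`, which is what makes results from different methods comparable.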

Conclusions: Our package enables users to apply the merged consensus clustering methodology conveniently within the R programming environment, providing both analysis and graphical display functions for exploring clustering approaches. It extends the basic principle of consensus clustering by allowing results from different methods to be merged, giving an averaged clustering robustness. We show that this extension is useful in correcting for the tendency of clustering algorithms to treat outliers differently within datasets. The R package, clusterCons, is freely available from CRAN and SourceForge under the GNU public licence.
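The merging step mentioned in the Conclusions can be sketched as a weighted, element-wise average of the consensus matrices produced by different methods. This is an illustrative Python sketch under that assumption, not the clusterCons implementation. The function name `merge_consensus`, the optional weights, and both example matrices are hypothetical.

```python
def merge_consensus(matrices, weights=None):
    """Element-wise weighted average of equally sized consensus matrices."""
    if weights is None:
        weights = [1.0] * len(matrices)   # equal weight per method by default
    total = sum(weights)
    n = len(matrices[0])
    return [[sum(w * mat[i][j] for w, mat in zip(weights, matrices)) / total
             for j in range(n)] for i in range(n)]

# Hypothetical consensus matrices for three items (a, b, c) from two methods.
# Method 1 usually groups the outlier c with a and b; method 2 always isolates it.
method1 = [[1.0, 1.0, 0.8],
           [1.0, 1.0, 0.8],
           [0.8, 0.8, 1.0]]
method2 = [[1.0, 1.0, 0.0],
           [1.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

merged = merge_consensus([method1, method2])
weighted = merge_consensus([method1, method2], weights=[3, 1])
```

The pair (a, b) keeps full support in the merged matrix because both methods agree on it, while support for pairing the outlier c with a or b is reduced to the average of the two methods. Robustness measures computed on the merged matrix therefore reflect agreement across methods rather than the behaviour of any single one.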

Citing Articles

BioNAR: an integrated biological network analysis package in bioconductor.

McLean C, Sorokin A, Simpson T, Armstrong J, Sorokina O. Bioinform Adv. 2023; 3(1):vbad137.

PMID: 37860105 PMC: 10582516. DOI: 10.1093/bioadv/vbad137.


Unsupervised Algorithms for Microarray Sample Stratification.

Fratello M, Cattelani L, Federico A, Pavel A, Scala G, Serra A. Methods Mol Biol. 2021; 2401:121-146.

PMID: 34902126 DOI: 10.1007/978-1-0716-1839-4_9.


Dissecting the Shared and Context-Dependent Pathways Mediated by the p140Cap Adaptor Protein in Cancer and in Neurons.

Chapelle J, Sorokina O, McLean C, Salemme V, Alfieri A, Angelini C. Front Cell Dev Biol. 2019; 7:222.

PMID: 31681758 PMC: 6803390. DOI: 10.3389/fcell.2019.00222.


Regional Diversity in the Postsynaptic Proteome of the Mouse Brain.

Roy M, Sorokina O, McLean C, Tapia-Gonzalez S, DeFelipe J, Armstrong J. Proteomes. 2018; 6(3).

PMID: 30071621 PMC: 6161190. DOI: 10.3390/proteomes6030031.


Synaptic Interactome Mining Reveals p140Cap as a New Hub for PSD Proteins Involved in Psychiatric and Neurological Disorders.

Alfieri A, Sorokina O, Adrait A, Angelini C, Russo I, Morellato A. Front Mol Neurosci. 2017; 10:212.

PMID: 28713243 PMC: 5492163. DOI: 10.3389/fnmol.2017.00212.

