
Topology Based Data Analysis Identifies a Subgroup of Breast Cancers with a Unique Mutational Profile and Excellent Survival

Overview
Specialty: Science
Date: 2011 Apr 13
PMID: 21482760
Citations: 135
Abstract

High-throughput biological data, whether generated by sequencing, transcriptional microarrays, proteomics, or other means, continue to require analytic methods that address their high-dimensional aspects. Because the computational part of data analysis ultimately identifies shape characteristics in the organization of data sets, the mathematics of shape recognition in high dimensions continues to be a crucial part of data analysis. This article introduces a method that extracts information from high-throughput microarray data and, by using topology, provides greater depth of information than current analytic techniques. The method, termed Progression Analysis of Disease (PAD), first identifies robust aspects of cluster analysis, then goes deeper to find a multitude of biologically meaningful shape characteristics in these data. Additionally, because PAD incorporates a visualization tool, it provides a simple picture or graph that can be used to further explore these data. Although PAD can be applied to a wide range of high-throughput data types, it is used here, as an example, to analyze breast cancer transcriptional data. This analysis identified a unique subgroup of Estrogen Receptor-positive (ER(+)) breast cancers that express high levels of c-MYB and low levels of innate inflammatory genes. These patients exhibit 100% survival and no metastasis. No supervised step beyond the distinction between tumor and healthy patients was used to identify this subtype. The group has a clear, distinct, and statistically significant molecular signature. It highlights coherent biology but is invisible to cluster methods, and it does not fit into the accepted classification of ER(+) breast cancers into Luminal A, Luminal B, and Normal-like subtypes. We denote the group as c-MYB(+) breast cancer.
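The abstract does not spell out the algorithm behind PAD. Since several of the citing articles listed on this page concern the Mapper algorithm, the following is a minimal sketch of a generic Mapper-style construction, offered only to illustrate how a topological summary can expose shape that flat clustering misses. It is not the authors' implementation: the function names (`single_linkage`, `mapper`), the parameters (`n_intervals`, `overlap`, `eps`), the choice of filter (the x-coordinate), and the toy data (points on a circle rather than expression profiles) are all assumptions made for this example.

```python
import math
from itertools import combinations


def single_linkage(indices, points, eps):
    """Split `indices` into connected components of the graph that links
    any two points lying within distance `eps` of each other."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_values, n_intervals=4, overlap=0.5, eps=1.0):
    """Toy Mapper: cover the filter range with overlapping intervals,
    cluster the points falling in each interval, and connect clusters
    that share at least one point."""
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    pad = overlap * length / 2  # how far each interval extends past its bin
    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(members, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Hypothetical toy data: 24 evenly spaced points on the unit circle.
points = [(math.cos(2 * math.pi * k / 24), math.sin(2 * math.pi * k / 24))
          for k in range(24)]
filter_values = [x for x, _ in points]  # filter = x-coordinate (an assumption)

nodes, edges = mapper(points, filter_values, n_intervals=4, overlap=0.5, eps=0.3)

# Degree of each node in the resulting graph.
degree = [sum(1 for e in edges if n in e) for n in range(len(nodes))]
print(len(nodes), "nodes,", len(edges), "edges, degrees:", degree)
```

In this toy case, single-linkage clustering of the whole point set with the same `eps` would return one cluster, whereas the Mapper graph is a closed ring of nodes, so the loop in the data survives in the summary. That is the sense in which a topological graph can show structure that a cluster partition hides; how PAD chooses its filter and clustering for microarray data is not stated in the abstract and is not modeled here.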

Citing Articles

A distribution-guided Mapper algorithm.

Tao Y, Ge S BMC Bioinformatics. 2025; 26(1):73.

PMID: 40045218 PMC: 11881416. DOI: 10.1186/s12859-025-06085-5.


Simplicity within biological complexity.

Przulj N, Malod-Dognin N Bioinform Adv. 2025; 5(1):vbae164.

PMID: 39927291 PMC: 11805345. DOI: 10.1093/bioadv/vbae164.


EiDA: A lossless approach for dynamic functional connectivity; application to fMRI data of a model of ageing.

Alteriis G, MacNicol E, Hancock F, Ciaramella A, Cash D, Expert P Imaging Neurosci (Camb). 2025; 2:1-22.

PMID: 39927148 PMC: 11801787. DOI: 10.1162/imag_a_00113.


Deconstructing the Mapper algorithm to extract richer topological and temporal features from functional neuroimaging data.

Hasegan D, Geniesse C, Chowdhury S, Saggar M Netw Neurosci. 2024; 8(4):1355-1382.

PMID: 39735492 PMC: 11675014. DOI: 10.1162/netn_a_00403.


Hierarchical simplicial manifold learning.

Zhang W, Shih Y, Li J PNAS Nexus. 2024; 3(12):pgae530.

PMID: 39660072 PMC: 11631340. DOI: 10.1093/pnasnexus/pgae530.

