Deep-learning Approach to Identifying Cancer Subtypes Using High-dimensional Genomic Data

Overview
Journal Bioinformatics
Specialty Biology
Date 2019 Oct 12
PMID 31603461
Citations 42
Abstract

Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and to enable individualized patient management. Existing methods are limited in their ability to handle extremely high-dimensional data and are affected by misleading, irrelevant factors, which results in ambiguous and overlapping subtypes.

Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn a cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data.
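One way to picture the joint training described above is as a single objective that adds a clustering term and a gene-selection term to an ordinary classification loss. The sketch below is illustrative only and is not taken from the DeepType package: the function names, the trade-off weights `alpha` and `beta`, and the specific choice of a K-means-style penalty and a row-wise sparsity penalty are all assumptions made for the example.

```python
import math

def cross_entropy(logits, labels):
    """Supervised term: mean softmax cross-entropy; labels are class indices."""
    total = 0.0
    for row, label in zip(logits, labels):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[label]
    return total / len(labels)

def kmeans_penalty(hidden, centers):
    """Unsupervised term: mean squared distance from each hidden
    representation to its nearest cluster center."""
    total = 0.0
    for h in hidden:
        total += min(sum((a - b) ** 2 for a, b in zip(h, c)) for c in centers)
    return total / len(hidden)

def group_sparsity(first_layer_weights):
    """Dimensionality-reduction term: sum of row-wise L2 norms, where each
    row is assumed to hold the outgoing weights of one input gene."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in first_layer_weights)

def joint_loss(logits, labels, hidden, centers, first_layer_weights,
               alpha=0.01, beta=0.1):
    """Weighted sum of the three terms; alpha and beta are hypothetical
    trade-off weights, not values reported by the authors."""
    return (cross_entropy(logits, labels)
            + alpha * group_sparsity(first_layer_weights)
            + beta * kmeans_penalty(hidden, centers))
```

In a sketch like this, driving a whole weight row toward zero would effectively drop the corresponding gene, which is one plausible route to "using fewer genes", while the clustering term pulls hidden representations toward compact groups. Whether DeepType uses exactly these terms should be checked against the paper and its released code.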

Availability And Implementation: An open-source software package for the proposed method is freely available at http://www.acsu.buffalo.edu/~yijunsun/lab/DeepType.html.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

A generative deep neural network for pan-digestive tract cancer survival analysis.

Xu L, Lan T, Huang Y, Wang L, Lin J, Song X BioData Min. 2025; 18(1):9.

PMID: 39871331 PMC: 11771125. DOI: 10.1186/s13040-025-00426-z.


Cancer molecular subtyping using limited multi-omics data with missingness.

Bu Y, Liang J, Li Z, Wang J, Wang J, Yu G PLoS Comput Biol. 2024; 20(12):e1012710.

PMID: 39724112 PMC: 11709273. DOI: 10.1371/journal.pcbi.1012710.


Fine-grained Patient Similarity Measuring using Contrastive Graph Similarity Networks.

Liu Y, Zhang Z, Qin S, Salim F, Bian J, Jimeno Yepes A Proc (IEEE Int Conf Healthc Inform). 2024; 2024:1-10.

PMID: 39698046 PMC: 11654828. DOI: 10.1109/ichi61247.2024.00009.


Multi-fusion strategy network-guided cancer subtypes discovering based on multi-omics data.

Liu J, Xue X, Wen P, Song Q, Yao J, Ge S Front Genet. 2024; 15:1466825.

PMID: 39610828 PMC: 11602503. DOI: 10.3389/fgene.2024.1466825.


Translation of Epigenetics in Cell-Free DNA Liquid Biopsy Technology and Precision Oncology.

Tan W, Nagabhyrava S, Ang-Olson O, Das P, Ladel L, Sailo B Curr Issues Mol Biol. 2024; 46(7):6533-6565.

PMID: 39057032 PMC: 11276574. DOI: 10.3390/cimb46070390.

