
Multi-omic and Multi-view Clustering Algorithms: Review and Cancer Benchmark

Overview
Specialty: Biochemistry
Date: 2018 Oct 9
PMID: 30295871
Citations: 166
Abstract

Recent high-throughput experimental methods have been used to collect large biomedical omics datasets. Clustering of single-omic datasets has proven invaluable for biological and medical research. The decreasing cost and development of additional high-throughput methods now enable measurement of multi-omic data. Clustering multi-omic data has the potential to reveal further systems-level insights, but raises computational and biological challenges. Here, we review algorithms for multi-omics clustering and discuss key issues in applying these algorithms. Our review covers methods developed specifically for omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types. In addition, using cancer data from TCGA, we perform an extensive benchmark spanning ten different cancer types, providing the first systematic comparison of leading multi-omics and multi-view clustering algorithms. The results highlight key issues regarding the use of single- versus multi-omics, the choice of clustering strategy, the power of generic multi-view methods and the use of approximated p-values for gauging solution quality. Due to the growing use of multi-omics data, we expect these issues to be important for future progress in the field.

Citing Articles

Advancements in proteogenomics for preclinical targeted cancer therapy research.

Suo Y, Song Y, Wang Y, Liu Q, Rodriguez H, Zhou H. Biophys Rep. 2025; 11(1):56-76.

PMID: 40070661 PMC: 11891078. DOI: 10.52601/bpr.2024.240053.


An Attention-Aware Multi-Task Learning Framework Identifies Candidate Targets for Drug Repurposing in Sarcopenia.

Reza M, Qiu C, Lin X, Su K, Liu A, Zhang X. J Cachexia Sarcopenia Muscle. 2025; 16(2):e13661.

PMID: 40045692 PMC: 11883102. DOI: 10.1002/jcsm.13661.


Integrative Analysis of Metabolome and Proteome in the Cerebrospinal Fluid of Patients with Multiple System Atrophy.

George N, Kwon M, Jang Y, Kim S, Hwang J, Lee S. Cells. 2025; 14(4).

PMID: 39996738 PMC: 11853536. DOI: 10.3390/cells14040265.


Mapping the knowledge of omics in myocardial infarction: A scientometric analysis in R Studio, VOSviewer, Citespace, and SciMAT.

Wei X, Wang M, Yu S, Han Z, Li C, Zhong Y. Medicine (Baltimore). 2025; 104(7):e41368.

PMID: 39960900 PMC: 11835070. DOI: 10.1097/MD.0000000000041368.


Feature graphs for interpretable unsupervised tree ensembles: centrality, interaction, and application in disease subtyping.

Sirocchi C, Urschler M, Pfeifer B. BioData Min. 2025; 18(1):15.

PMID: 39955586 PMC: 11829558. DOI: 10.1186/s13040-025-00430-3.

