Simple and Scalable Algorithms for Cluster-Aware Precision Medicine

Overview
Date 2024 Jul 17
PMID 39015742
Abstract

AI-enabled precision medicine promises a transformational improvement in healthcare outcomes. However, training on biomedical data presents significant challenges because such data are often high-dimensional, clustered, and of limited sample size. To overcome these challenges, we propose a simple and scalable approach for cluster-aware embedding that combines latent factor methods with a convex clustering penalty in a modular way. Our novel approach overcomes the complexity and limitations of current joint embedding and clustering methods and enables hierarchically clustered principal component analysis (PCA), locally linear embedding (LLE), and canonical correlation analysis (CCA). Through numerical experiments and real-world examples, we demonstrate that our approach outperforms fourteen clustering methods on highly underdetermined problems (e.g., with limited sample size) as well as on large sample datasets. Importantly, our approach does not require the user to choose the desired number of clusters, improves model selection when the user does choose one, and yields interpretable, hierarchically clustered embedding dendrograms. Thus, our approach improves significantly on existing methods for identifying patient subgroups in multiomics and neuroimaging data and enables scalable and interpretable biomarkers for precision medicine.
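
To make the idea of combining a latent factor method with a convex clustering penalty more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the textbook convex clustering objective with uniform pairwise weights, applied to a PCA embedding computed in a separate step. The function names `pca_embed` and `convex_clustering_objective`, the synthetic data, and the value of `gamma` are all hypothetical choices made for illustration; the abstract does not specify the exact objective or solver.

```python
import numpy as np

def pca_embed(X, k):
    """Project the rows of X onto the top-k principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]                      # n x k matrix of PC scores

def convex_clustering_objective(Z, C, gamma):
    """Generic convex clustering objective (assumed form, uniform weights):
    0.5 * ||Z - C||_F^2 + gamma * sum_{i<j} ||c_i - c_j||_2,
    where row c_i of C is the centroid assigned to embedded sample z_i."""
    fit = 0.5 * np.sum((Z - C) ** 2)
    diffs = C[:, None, :] - C[None, :, :]        # all pairwise centroid differences
    pair_norms = np.linalg.norm(diffs, axis=2)   # n x n matrix of distances
    penalty = np.triu(pair_norms, k=1).sum()     # count each pair once
    return fit + gamma * penalty

# Synthetic, underdetermined toy data: 20 samples, 50 features, two groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (10, 50)), rng.normal(2, 0.5, (10, 50))])
Z = pca_embed(X, k=2)

# Two extreme candidate solutions for the centroids:
no_fusion = convex_clustering_objective(Z, Z, gamma=1.0)        # every sample its own cluster
all_fused = np.tile(Z.mean(axis=0), (len(Z), 1))
full_fusion = convex_clustering_objective(Z, all_fused, gamma=1.0)  # one cluster
```

In this generic formulation, small values of `gamma` favor leaving each sample as its own cluster and large values favor fusing all centroids; sweeping `gamma` between those extremes is what traces out a hierarchy of fusions, which is consistent with the dendrograms the abstract describes.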
