UMAP As a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study

Overview
Journal J Phys Chem B
Specialty Chemistry
Date 2021 May 11
PMID 33973773
Citations 31
Abstract

Proteins are the molecular machines of life. The multitude of possible conformations that proteins can adopt determines their free-energy landscapes. However, the inherently high dimensionality of a protein free-energy landscape poses a challenge to deciphering how proteins perform their functions. For this reason, dimensionality reduction is an active field of research for molecular biologists. The uniform manifold approximation and projection (UMAP) is a dimensionality reduction method based on a fuzzy topological analysis of data. In the present study, the performance of UMAP is compared with that of other popular dimensionality reduction methods such as t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), and time-structure independent components analysis (tICA) in the context of analyzing molecular dynamics simulations of the circadian clock protein VIVID. A good dimensionality reduction method should accurately represent the data structure on the projected components. The comparison of the raw high-dimensional data with the projections obtained using different dimensionality reduction methods based on various metrics showed that UMAP has superior performance when compared with linear reduction methods (PCA and tICA) and has competitive performance and scalable computational cost.
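
The abstract states that a good projection should represent the structure of the raw data, but it does not name the metrics used. As an illustration only, the sketch below scores a projection by the Pearson correlation between pairwise distances in the original space and in the projected space. This metric, the function names, and the toy coordinates are all assumptions made for the example; they are not taken from the study.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def distance_preservation(high_dim, low_dim):
    """Hypothetical quality score: correlation between pairwise distances
    before and after projection (1.0 means the distances agree up to a
    positive linear rescaling)."""
    if len(high_dim) != len(low_dim):
        raise ValueError("both embeddings must contain the same samples")
    return pearson(pairwise_distances(high_dim), pairwise_distances(low_dim))


# Toy data (made up): four "frames" described by three coordinates.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (3.0, 3.0, 0.0)]

# Projection A drops the constant third coordinate, so distances are unchanged.
faithful = [(x, y) for x, y, _ in high]

# Projection B keeps only the first coordinate and merges frames 0 and 2.
distorted = [(x,) for x, _, _ in high]

score_faithful = distance_preservation(high, faithful)
score_distorted = distance_preservation(high, distorted)
print(f"faithful:  {score_faithful:.3f}")
print(f"distorted: {score_distorted:.3f}")
```

In practice the low-dimensional coordinates would come from PCA, tICA, t-SNE, or UMAP applied to features extracted from a trajectory, and a global distance correlation like this one would typically be complemented by measures of local neighborhood preservation.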

Citing Articles

Haematology dimension reduction, a large scale application to regular care haematology data.

Joosse H, Chumsaeng-Reijers C, Huisman A, Hoefer I, van Solinge W, Haitjema S BMC Med Inform Decis Mak. 2025; 25(1):75.

PMID: 39939843 PMC: 11823074. DOI: 10.1186/s12911-025-02899-8.


Extended Quality (eQual): Radial threshold clustering based on n-ary similarity.

Chen L, Smith M, Roe D, Alain Miranda-Quintana R bioRxiv. 2024.

PMID: 39677679 PMC: 11643124. DOI: 10.1101/2024.12.05.627001.


Unsupervised learning analysis on the proteomes of Zika virus.

Lara-Ramirez E, Rivera G, Oliva-Hernandez A, Bocanegra-Garcia V, Lopez J, Guo X PeerJ Comput Sci. 2024; 10:e2443.

PMID: 39650519 PMC: 11623125. DOI: 10.7717/peerj-cs.2443.


Alzheimer's Disease Immunotherapy and Mimetic Peptide Design for Drug Development: Mutation Screening, Molecular Dynamics, and a Quantum Biochemistry Approach Focusing on Aducanumab::Aβ2-7 Binding Affinity.

Franca V, Bezerra E, da Costa R, Carvalho H, Freire V, Matos G ACS Chem Neurosci. 2024; 15(19):3543-3562.

PMID: 39302203 PMC: 11450751. DOI: 10.1021/acschemneuro.4c00453.


Streamlining NMR Chemical Shift Predictions for Intrinsically Disordered Proteins: Design of Ensembles with Dimensionality Reduction and Clustering.

Bakker M, Gaffour A, Juhas M, Zapletal V, Stosek J, Bratholm L J Chem Inf Model. 2024; 64(16):6542-6556.

PMID: 39099394 PMC: 11412307. DOI: 10.1021/acs.jcim.4c00809.

