
Nonnegative Matrix Factorization: an Analytical and Interpretive Tool in Computational Biology

Overview
Specialty Biology
Date 2008 Jul 26
PMID 18654623
Citations 131
Abstract

In the last decade, advances in high-throughput technologies such as DNA microarrays have made it possible to measure the expression levels of tens of thousands of genes and proteins simultaneously. The result is a large volume of biological data that requires analysis and interpretation. Nonnegative matrix factorization (NMF) was introduced as an unsupervised, parts-based learning paradigm in which a nonnegative matrix V is decomposed into two nonnegative matrices, W and H, such that V ≈ WH, via a multiplicative updates algorithm. For a p × n gene expression matrix V holding observations on p genes from n samples, each column of W defines a metagene, and each column of H represents the metagene expression pattern of the corresponding sample. NMF has been applied primarily in an unsupervised setting in image and natural language processing. More recently, it has been used successfully in a variety of applications in computational biology, including molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes, and biomedical informatics. In this paper, we review NMF as a data analytical and interpretive tool in computational biology, with an emphasis on these applications.
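
The decomposition described in the abstract can be illustrated with a minimal Python sketch. The abstract names a multiplicative updates algorithm but does not state the objective; the sketch assumes the widely used Euclidean (Frobenius-norm) form of the updates. The function name `nmf_multiplicative`, the rank `k`, the iteration count, the small constant `eps`, and the random toy matrix are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative p x n matrix V as W @ H.

    W is p x k (columns are metagenes), H is k x n (columns are
    per-sample metagene expression patterns). Assumes the Euclidean
    multiplicative update rules; eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    p, n = V.shape
    # Random strictly positive starting point (an assumption; other
    # initialisation schemes exist).
    W = rng.random((p, k)) + eps
    H = rng.random((k, n)) + eps
    errors = []
    for _ in range(n_iter):
        # Elementwise updates keep W and H nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H, "fro"))
    return W, H, errors

# Toy stand-in for an expression matrix: 50 "genes" by 12 "samples".
V = np.random.default_rng(42).random((50, 12))
W, H, errors = nmf_multiplicative(V, k=3)

# One common convention (assumed here, not stated in the abstract):
# assign each sample to the metagene with the largest coefficient in H.
sample_clusters = H.argmax(axis=0)
```

Because the updates are purely multiplicative, entries that start nonnegative stay nonnegative, which is what gives the factors their parts-based interpretation. In practice the rank `k` and the initialisation both affect the result, so runs are usually repeated with several seeds.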

Citing Articles

Population-level analyses identify host and environmental variables influencing the vaginal microbiome.

Qin L, Sun T, Li X, Zhao S, Liu Z, Zhang C Signal Transduct Target Ther. 2025; 10(1):64.

PMID: 39966341 PMC: 11836416. DOI: 10.1038/s41392-025-02152-8.


Simplicity within biological complexity.

Przulj N, Malod-Dognin N Bioinform Adv. 2025; 5(1):vbae164.

PMID: 39927291 PMC: 11805345. DOI: 10.1093/bioadv/vbae164.


CSI-GEP: A GPU-based unsupervised machine learning approach for recovering gene expression programs in atlas-scale single-cell RNA-seq data.

Liu X, Chapple R, Bennett D, Wright W, Sanjali A, Culp E Cell Genom. 2025; 5(1):100739.

PMID: 39788105 PMC: 11770216. DOI: 10.1016/j.xgen.2024.100739.


Decomposition of the pangenome matrix reveals a structure in gene distribution in the species.

Chauhan S, Ardalani O, Hyun J, Monk J, Phaneuf P, Palsson B mSphere. 2025; 10(1):e0053224.

PMID: 39745367 PMC: 11774025. DOI: 10.1128/msphere.00532-24.


Nonnegative matrix factorization for analyzing state dependent neuronal network dynamics in calcium recordings.

Carbonero D, Noueihed J, Kramer M, White J Sci Rep. 2024; 14(1):27899.

PMID: 39537711 PMC: 11560946. DOI: 10.1038/s41598-024-78448-6.

