The Non-negative Matrix Factorization Toolbox for Biological Data Mining

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2013 Apr 18
PMID: 23591137
Citations: 41
Abstract

Background: Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Although packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. No complete NMF package exists that lets the bioinformatics community perform various data mining tasks on biological data.

Results: We provide a convenient MATLAB toolbox containing implementations of both various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. The data mining approaches implemented in the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing value imputation, data visualization, and statistical comparison.

Conclusions: A series of analyses, such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison, can be performed using this toolbox.
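
To make the factorization and clustering tasks named in the abstract concrete, here is a minimal Python sketch of basic NMF using the standard multiplicative update rules for the Frobenius-norm objective, followed by a simple clustering step that assigns each sample to its largest coefficient. This is not the toolbox's code (the toolbox is written in MATLAB); the function names `nmf_multiplicative` and `cluster_samples`, the symbols `A` and `Y`, and all parameter defaults are assumptions chosen for illustration.

```python
import numpy as np


def nmf_multiplicative(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (features x samples) as A @ Y.

    A (features x k) holds basis vectors and Y (k x samples) holds
    coefficients. Both are kept non-negative by multiplicative updates.
    Illustrative sketch only; names and defaults are hypothetical.
    """
    X = np.asarray(X, dtype=float)
    if (X < 0).any():
        raise ValueError("X must be non-negative")
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Random strictly positive initialization.
    A = rng.random((m, k)) + eps
    Y = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Update coefficients, then basis; eps guards against division by zero.
        Y *= (A.T @ X) / (A.T @ A @ Y + eps)
        A *= (X @ Y.T) / (A @ Y @ Y.T + eps)
    return A, Y


def cluster_samples(Y):
    """Assign each sample (column of Y) to the factor with the largest coefficient."""
    return np.argmax(Y, axis=0)


if __name__ == "__main__":
    # Small synthetic matrix with two feature/sample blocks plus weak noise.
    rng = np.random.default_rng(1)
    X = rng.random((6, 8)) * 0.05
    X[:3, :4] += 1.0
    X[3:, 4:] += 1.0
    A, Y = nmf_multiplicative(X, k=2, n_iter=500)
    print("relative error:", np.linalg.norm(X - A @ Y) / np.linalg.norm(X))
    print("cluster labels:", cluster_samples(Y))
```

The cluster labels are only defined up to a permutation of the factors, and the result depends on the random initialization, so in practice several restarts are usually compared.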

Citing Articles

Muscle Synergy Analysis as a Tool for Assessing the Effectiveness of Gait Rehabilitation Therapies: A Methodological Review and Perspective.

Borzelli D, De Marchis C, Quercia A, de Pasquale P, Casile A, Quartarone A. Bioengineering (Basel). 2024; 11(8).

PMID: 39199751 PMC: 11351442. DOI: 10.3390/bioengineering11080793.


Elucidating immune-related gene transcriptional programs via factorization of large-scale RNA-profiles.

He S, Gubin M, Rafei H, Basar R, Dede M, Jiang X. iScience. 2024; 27(6):110096.

PMID: 38957791 PMC: 11217617. DOI: 10.1016/j.isci.2024.110096.


Crosstalk among proximal tubular cells, macrophages, and fibroblasts in acute kidney injury: single-cell profiling from the perspective of ferroptosis.

Wang Y, Shen Z, Mo S, Zhang H, Chen J, Zhu C. Hum Cell. 2024; 37(4):1039-1055.

PMID: 38753279 PMC: 11194220. DOI: 10.1007/s13577-024-01072-z.


Hyperspectral dark-field microscopy of human breast lumpectomy samples for tumor margin detection in breast-conserving surgery.

Hwang J, Cheney P, Kanick S, Le H, McClatchy 3rd D, Zhang H. J Biomed Opt. 2024; 29(9):093503.

PMID: 38715717 PMC: 11075096. DOI: 10.1117/1.JBO.29.9.093503.


Dynamics of brain-muscle networks reveal effects of age and somatosensory function on gait.

Roeder L, Breakspear M, Kerr G, Boonstra T. iScience. 2024; 27(3):109162.

PMID: 38414847 PMC: 10897916. DOI: 10.1016/j.isci.2024.109162.

