
Extracting Gene Expression Profiles Common to Colon and Pancreatic Adenocarcinoma Using Simultaneous Nonnegative Matrix Factorization

Overview
Publisher World Scientific
Specialty Biology
Date 2008 Jan 31
PMID 18229692
Citations 12
Abstract

In this paper we introduce a clustering algorithm capable of simultaneously factorizing two distinct gene expression datasets with the aim of uncovering gene regulatory programs that are common to the two phenotypes. The siNMF algorithm simultaneously searches for two factorizations that share the same gene expression profiles. The two key ingredients of this algorithm are the nonnegativity constraint and the offset variables, which together ensure the sparseness of the factorizations. While cancer is a very heterogeneous disease, there is overwhelming recent evidence that the differences between cancer subtypes implicate entire pathways and biological processes involving large numbers of genes, rather than changes in single genes. We applied our simultaneous factorization algorithm to look for gene expression profiles that are common to the more homogeneous pancreatic ductal adenocarcinoma (PDAC) and the more heterogeneous colon adenocarcinoma. The fact that the PDAC signature is active in a large fraction of colon adenocarcinoma samples suggests that the oncogenic mechanisms involved may be similar to those in PDAC, at least in this subset of colon samples. There are many approaches to uncovering common mechanisms involved in different phenotypes, but most are based on comparing gene lists. The approach presented in this paper additionally takes gene expression data into account and can thus be more sensitive.
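
The idea of two factorizations that share the same gene expression profiles can be illustrated with a small sketch. The abstract does not give the siNMF update rules, so the code below is not the authors' algorithm: it assumes a squared-error objective, uses standard multiplicative NMF updates, and models the offset variables as one nonnegative per-gene offset vector per dataset. The function name `simultaneous_nmf` and all parameter names are hypothetical.

```python
import numpy as np


def simultaneous_nmf(X1, X2, n_profiles, n_iter=500, eps=1e-9, seed=0):
    """Sketch of a shared-profile factorization (assumed form, not the paper's):

        X1 ~ A1 @ S + o1    (dataset 1, samples x genes)
        X2 ~ A2 @ S + o2    (dataset 2, samples x genes)

    S holds the gene expression profiles shared by both datasets,
    A1 and A2 hold per-sample loadings, o1 and o2 are per-gene offsets.
    All factors stay nonnegative because the updates are multiplicative.
    """
    rng = np.random.default_rng(seed)
    n1, n_genes = X1.shape
    n2 = X2.shape[0]

    # Stack the two datasets so a single S is fitted to both at once.
    X = np.vstack([X1, X2]).astype(float)

    # Indicator columns route each sample to its own dataset's offset row.
    ind = np.zeros((n1 + n2, 2))
    ind[:n1, 0] = 1.0
    ind[n1:, 1] = 1.0

    A = rng.random((n1 + n2, n_profiles)) + eps  # sample loadings
    S = rng.random((n_profiles, n_genes)) + eps  # shared profiles
    O = rng.random((2, n_genes)) + eps           # per-dataset gene offsets

    for _ in range(n_iter):
        # Each factor is rescaled by (data term) / (current reconstruction term).
        R = A @ S + ind @ O
        A *= (X @ S.T) / (R @ S.T + eps)
        R = A @ S + ind @ O
        S *= (A.T @ X) / (A.T @ R + eps)
        R = A @ S + ind @ O
        O *= (ind.T @ X) / (ind.T @ R + eps)

    return A[:n1], A[n1:], S, O[0], O[1]
```

Calling `simultaneous_nmf(X1, X2, n_profiles=3)` on two nonnegative samples-by-genes matrices with the same gene columns returns loadings whose rows indicate how strongly each shared profile is active in each sample, which is the kind of readout the abstract describes for the PDAC signature in colon samples. This sketch does not reproduce the sparseness behaviour that the abstract attributes to the paper's method.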

Citing Articles

WormTensor: a clustering method for time-series whole-brain activity data from C. elegans.

Tsuyuzaki K, Yamamoto K, Toyoshima Y, Sato H, Kanamori M, Teramoto T. BMC Bioinformatics. 2023; 24(1):254.

PMID: 37328814 PMC: 10273573. DOI: 10.1186/s12859-023-05230-2.


Elucidating transcriptomic profiles from single-cell RNA sequencing data using nature-inspired compressed sensing.

Yu Z, Bian C, Liu G, Zhang S, Wong K, Li X. Brief Bioinform. 2021; 22(5).

PMID: 33855366 PMC: 8579163. DOI: 10.1093/bib/bbab125.


A hierarchical spatiotemporal analog forecasting model for count data.

McDermott P, Wikle C, Millspaugh J. Ecol Evol. 2018; 8(1):790-800.

PMID: 29321914 PMC: 5756884. DOI: 10.1002/ece3.3621.


Novel Influences of IL-10 on CNS Inflammation Revealed by Integrated Analyses of Cytokine Networks and Microglial Morphology.

Anderson W, Greenhalgh A, Takwale A, David S, Vadigepalli R. Front Cell Neurosci. 2017; 11:233.

PMID: 28855862 PMC: 5557777. DOI: 10.3389/fncel.2017.00233.


Integrated genomic analysis of biological gene sets with applications in lung cancer prognosis.

Chu S, Huang Y. BMC Bioinformatics. 2017; 18(1):336.

PMID: 28697753 PMC: 5505153. DOI: 10.1186/s12859-017-1737-2.