
Multivariate Regression Analysis of Distance Matrices for Testing Associations Between Gene Expression Patterns and Related Variables

Overview
Specialty Science
Date 2006 Dec 6
PMID 17146048
Citations 134
Abstract

A fundamental step in the analysis of gene expression and other high-dimensional genomic data is the calculation of the similarity or distance between pairs of individual samples in a study. If one has collected N total samples and assayed the expression level of G genes on those samples, then an N x N similarity matrix can be formed that reflects the correlation or similarity of the samples with respect to the expression values over the G genes. This matrix can then be examined for patterns via standard data reduction and cluster analysis techniques. We consider an alternative to conventional data reduction and cluster analyses of similarity matrices that is rooted in traditional linear models. This analysis method allows predictor variables collected on the samples to be related to variation in the pairwise similarity/distance values reflected in the matrix. The proposed multivariate method avoids the need for reducing the dimensions of a similarity matrix, can be used to assess relationships between the genes used to construct the matrix and additional information collected on the samples under study, and can be used to analyze individual genes or groups of genes identified in different ways. The technique can be used with any high-dimensional assay or data type and is ideally suited for testing subsets of genes defined by their participation in a biochemical pathway or other a priori grouping. We showcase the methodology using three published gene expression data sets.
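The workflow described in the abstract (build an N x N distance matrix from expression over G genes, then relate predictor variables to the variation in that matrix) can be sketched as follows. The abstract does not state the test statistic, so this sketch assumes the pseudo-F statistic commonly used in distance-based regression, computed from a Gower-centered matrix and assessed by permuting the predictor rows. The distance d = sqrt(2(1 - r)), the function names, and the parameters `n_perm` and `seed` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def correlation_distance(expr):
    """N x N distance matrix from an (N samples, G genes) array.

    Assumption: d_ij = sqrt(2 * (1 - r_ij)), with r_ij the Pearson
    correlation between samples i and j across the G genes.
    """
    r = np.clip(np.corrcoef(expr), -1.0, 1.0)  # rows are samples -> N x N
    d = np.sqrt(2.0 * (1.0 - r))
    np.fill_diagonal(d, 0.0)
    return d


def gower_center(dist):
    """Gower-centered matrix G = C A C, with A = -0.5 * D^2 and C = I - 11'/N."""
    n = dist.shape[0]
    a = -0.5 * dist ** 2
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ a @ c


def pseudo_f(g, x):
    """Pseudo-F for predictors x (N x p) against the centered matrix g."""
    n = x.shape[0]
    x1 = np.column_stack([np.ones(n), x])  # add an intercept column
    h = x1 @ np.linalg.pinv(x1)            # projection ("hat") matrix
    m = np.linalg.matrix_rank(x1)          # number of fitted parameters
    resid = np.eye(n) - h
    explained = np.trace(h @ g @ h) / (m - 1)
    unexplained = np.trace(resid @ g @ resid) / (n - m)
    return explained / unexplained


def permutation_test(dist, x, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = np.random.default_rng(seed)
    g = gower_center(dist)
    x = np.asarray(x, dtype=float).reshape(dist.shape[0], -1)
    observed = pseudo_f(g, x)
    hits = 0
    for _ in range(n_perm):
        shuffled = x[rng.permutation(x.shape[0])]  # break the sample-predictor link
        if pseudo_f(g, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic illustration only: 20 samples, 50 genes, one binary predictor.
    rng = np.random.default_rng(1)
    group = np.repeat([0.0, 1.0], 10)
    expr = rng.normal(size=(20, 50))
    expr[:, :25] += 3.0 * group[:, None]  # group 1 shares an expression pattern
    print(permutation_test(correlation_distance(expr), group))
```

To test a pathway or other a priori gene set under these assumptions, one would pass only the columns of `expr` for those genes to `correlation_distance` and rerun the permutation test; no dimension reduction of the matrix is involved.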

Citing Articles

Pre-exposure of abundant species to disturbance improves resilience in microbial metacommunities.

Cairns J, Hogle S, Alitupa E, Mustonen V, Hiltunen T Nat Ecol Evol. 2025; 9(3):395-405.

PMID: 39825086 DOI: 10.1038/s41559-024-02624-0.


Soil bacterial and fungal diversity and composition respond differently to desertified system restoration.

Pan C, Yuan F, Liu Y, Yu X, Liu J PLoS One. 2025; 20(1):e0309188.

PMID: 39761240 PMC: 11703004. DOI: 10.1371/journal.pone.0309188.


Patterns of antibiotic resistance genes and virulence factor genes in the gut microbiome of patients with osteoarthritis and rheumatoid arthritis.

Guo Y, Feng H, Du L, Yu Z Front Microbiol. 2024; 15:1427313.

PMID: 39633808 PMC: 11615078. DOI: 10.3389/fmicb.2024.1427313.


Environmental Factors Drive the Biogeographic Pattern of Root Endophytic Fungal Diversity in the Arid Regions of Northwest China.

Guo S, Ye G, Liu W, Liu R, Liu Z, Ma Y J Fungi (Basel). 2024; 10(10).

PMID: 39452631 PMC: 11508200. DOI: 10.3390/jof10100679.


The aberrant tonsillar microbiota modulates autoimmune responses in rheumatoid arthritis.

Li J, Li S, Jin J, Guo R, Jin Y, Cao L JCI Insight. 2024; 9(18).

PMID: 39163137 PMC: 11457857. DOI: 10.1172/jci.insight.175916.

