
Multivariate Multi-way Analysis of Multi-source Data

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2010 Jun 10
PMID: 20529933
Citations: 11
Abstract

Motivation: Analysis of variance (ANOVA)-type methods are the default tool for the analysis of data with multiple covariates. These tools have been generalized to the multivariate analysis of high-throughput biological datasets, where the main challenge is small sample size combined with high dimensionality. However, existing multi-way analysis methods are not designed for the increasingly important experiments in which data are obtained from multiple sources. Common examples of such settings include the integrated analysis of metabolic and gene expression profiles or, as in our case, of metabolic profiles from several tissues, in a controlled multi-way experimental setup where disease status, medical treatment, gender and time series are the usual covariates.

Results: We extend multivariate, multi-way ANOVA-type methods to multi-source cases by introducing a novel Bayesian model. The method is capable of finding covariate-related dependencies between the sources. It assumes that the measurements consist of groups of similarly behaving variables, and it estimates the multivariate covariate effects and their interaction effects for the discovered groups of variables. In particular, the method partitions the effects into those shared between the sources and source-specific ones. The method is specifically designed for datasets with small sample sizes and high dimensionality. We apply the method to a lipidomics dataset from a lung cancer study with a two-way experimental setup, where measurements have been taken from several tissues with mostly distinct lipids. The method is also directly applicable to gene expression and proteomics data.
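
To make the vocabulary of main effects, interaction effects, and shared versus source-specific effects concrete, here is a minimal sketch in Python. It is not the Bayesian model of the paper: it performs only a classical cell-mean decomposition for a balanced two-way design, applied separately to each source, with no clustering of variables and no prior. The source names, factor levels, and all numbers are made up for illustration, and each observation is assumed to be the average of one group of similarly behaving variables in one sample.

```python
from statistics import mean


def two_way_effects(cells):
    """Split the cell means of a balanced two-way design into effects.

    `cells` maps (level_a, level_b) -> list of observations.
    Returns (grand_mean, main_a, main_b, interaction).
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    cell_mean = {key: mean(values) for key, values in cells.items()}

    grand = mean(list(cell_mean.values()))
    # Main effect of a level: its marginal mean minus the grand mean.
    main_a = {a: mean([cell_mean[(a, b)] for b in levels_b]) - grand
              for a in levels_a}
    main_b = {b: mean([cell_mean[(a, b)] for a in levels_a]) - grand
              for b in levels_b}
    # Interaction: what remains of a cell mean after the additive part.
    interaction = {
        (a, b): cell_mean[(a, b)] - grand - main_a[a] - main_b[b]
        for a in levels_a
        for b in levels_b
    }
    return grand, main_a, main_b, interaction


# Hypothetical data: two sources, factors (disease, treatment),
# two samples per cell. All values are invented for this sketch.
sources = {
    "plasma": {
        ("healthy", "untreated"): [1.0, 1.2],
        ("healthy", "treated"): [1.1, 1.3],
        ("cancer", "untreated"): [2.0, 2.2],
        ("cancer", "treated"): [2.1, 2.3],
    },
    "lung": {
        ("healthy", "untreated"): [0.9, 1.1],
        ("healthy", "treated"): [1.0, 1.0],
        ("cancer", "untreated"): [2.0, 2.0],
        ("cancer", "treated"): [3.0, 3.0],
    },
}

effects = {name: two_way_effects(cells) for name, cells in sources.items()}

for name, (grand, disease, treatment, interaction) in effects.items():
    print(
        name,
        "disease effect (cancer):", round(disease["cancer"], 3),
        "interaction (cancer, treated):",
        round(interaction[("cancer", "treated")], 3),
    )
```

In this toy data the disease effect points the same way in both sources, which is the kind of pattern a multi-source model would treat as shared, while the disease-by-treatment interaction appears only in the second source and would count as source-specific. With only a handful of samples per cell such point estimates are unreliable, which is the setting the abstract says the Bayesian treatment is designed for.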

Availability: An R implementation is available at http://www.cis.hut.fi/projects/mi/software/multiWayCCA/.

Citing Articles

Extracellular Vesicle Protein Expression in Doped Bioactive Glasses: Further Insights Applying Anomaly Detection.

Nascimben M, Abreu H, Manfredi M, Cappellano G, Chiocchetti A, Rimondini L Int J Mol Sci. 2024; 25(6).

PMID: 38542533 PMC: 10971221. DOI: 10.3390/ijms25063560.


Bayesian predictive modeling of multi-source multi-way data.

Kim J, Sandri B, Rao R, Lock E Comput Stat Data Anal. 2023; 186.

PMID: 37274461 PMC: 10237362. DOI: 10.1016/j.csda.2023.107783.


ANOVA simultaneous component analysis: A tutorial review.

Bertinetto C, Engel J, Jansen J Anal Chim Acta X. 2021; 6:100061.

PMID: 33392497 PMC: 7772684. DOI: 10.1016/j.acax.2020.100061.


Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia.

Hu W, Lin D, Cao S, Liu J, Chen J, Calhoun V IEEE Trans Biomed Eng. 2018; 65(2):390-399.

PMID: 29364120 PMC: 5826588. DOI: 10.1109/TBME.2017.2771483.


Structure-revealing data fusion.

Acar E, Papalexakis E, Gurdeniz G, Rasmussen M, Lawaetz A, Nilsson M BMC Bioinformatics. 2014; 15:239.

PMID: 25015427 PMC: 4117975. DOI: 10.1186/1471-2105-15-239.

