Two-stage Linked Component Analysis for Joint Decomposition of Multiple Biologically Related Data Sets

Overview
Journal: Biostatistics
Specialty: Public Health
Date: 2022 Mar 31
PMID: 35358296
Abstract

Integrative analysis of multiple data sets has the potential to fully leverage the vast amount of high-throughput biological data being generated. In particular, such analysis will be powerful for making inferences from publicly available collections of genetic, transcriptomic, and epigenetic data sets that are designed to study shared biological processes but that vary in their target measurements, biological variation, unwanted noise, and batch variation. Thus, methods that enable the joint analysis of multiple data sets are needed to gain insights into shared biological processes that would otherwise be hidden by unwanted intra-data set variation. Here, we propose a method called two-stage linked component analysis (2s-LCA) to jointly decompose multiple biologically related experimental data sets with biological and technological relationships that can be structured into the decomposition. The consistency of the proposed method is established, and its empirical performance is evaluated via simulation studies. We apply 2s-LCA to jointly analyze four data sets focused on human brain development and identify meaningful patterns of gene expression in human neurogenesis that have shared structure across these data sets.

Citing Articles

The use of prognostic models in allogeneic transplants: a perspective guide for clinicians and investigators.

Sorror M. Blood. 2023; 141(18):2173-2186.

PMID: 36800564 PMC: 10273168. DOI: 10.1182/blood.2022017999.


Interpretive JIVE: Connections with CCA and an application to brain connectivity.

Murden R, Zhang Z, Guo Y, Risk B. Front Neurosci. 2022; 16:969510.

PMID: 36312020 PMC: 9614436. DOI: 10.3389/fnins.2022.969510.
