
SOFAR: Large-Scale Association Network Learning

Overview
Date 2021 Mar 22
PMID 33746241
Citations 1
Abstract

Many modern big data applications feature large scale in both the number of responses and the number of predictors. Better statistical efficiency and scientific insights can be enabled by understanding the large-scale response-predictor association network structures via layers of sparse latent factors ranked by importance. Yet sparsity and orthogonality have been two largely incompatible goals. To accommodate both features, in this paper we suggest the method of sparse orthogonal factor regression (SOFAR), which uses the sparse singular value decomposition with orthogonality-constrained optimization to learn the underlying association networks. The method has broad applications to both unsupervised and supervised learning tasks, such as biclustering with sparse singular value decomposition, sparse principal component analysis, sparse factor analysis, and sparse vector autoregression analysis. Exploiting the framework of convexity-assisted nonconvex optimization, we derive nonasymptotic error bounds for the suggested procedure that characterize its theoretical advantages. The statistical guarantees are powered by an efficient SOFAR algorithm with a convergence property. Both the computational and theoretical advantages of our procedure are demonstrated with several simulations and real data examples.
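
To make the idea of a sparse singular value decomposition concrete, the following is a minimal sketch of a rank-one sparse SVD computed by alternating soft-thresholded power iterations. It is a generic illustration and not the SOFAR algorithm from the paper, which additionally imposes orthogonality across multiple latent layers. The function names `soft_threshold` and `sparse_rank_one_svd`, the penalty levels `lam_u` and `lam_v`, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(z, lam):
    """Elementwise soft-thresholding, the proximal map of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_rank_one_svd(M, lam_u=0.0, lam_v=0.0, n_iter=100, tol=1e-8):
    """Approximate M by d * outer(u, v) with sparse unit-norm u and v.

    Hypothetical helper: alternates thresholded power-iteration updates,
    starting from the leading ordinary singular vectors of M.
    """
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    u, v = U[:, 0].copy(), Vt[0].copy()
    for _ in range(n_iter):
        # Update the left vector, then shrink small entries to exactly zero.
        u_new = soft_threshold(M @ v, lam_u)
        norm_u = np.linalg.norm(u_new)
        if norm_u == 0.0:  # penalty removed every entry
            return np.zeros(M.shape[0]), 0.0, np.zeros(M.shape[1])
        u_new /= norm_u
        # Same update for the right vector.
        v_new = soft_threshold(M.T @ u_new, lam_v)
        norm_v = np.linalg.norm(v_new)
        if norm_v == 0.0:
            return np.zeros(M.shape[0]), 0.0, np.zeros(M.shape[1])
        v_new /= norm_v
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    d = float(u @ M @ v)  # singular-value-like scale of the sparse layer
    return u, d, v


# Toy example: a sparse rank-one signal plus small noise.
rng = np.random.default_rng(0)
u0 = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
v0 = np.array([1.0, 0.0, 0.0, -1.0, 0.0]) / np.sqrt(2.0)
M = 10.0 * np.outer(u0, v0) + 0.01 * rng.standard_normal((6, 5))

u, d, v = sparse_rank_one_svd(M, lam_u=0.5, lam_v=0.5)
print("nonzero rows:", np.flatnonzero(u))
print("nonzero columns:", np.flatnonzero(v))
print("estimated scale:", round(d, 3))
```

Setting both penalties to zero reduces the iteration to an ordinary power method for the leading singular triple; larger penalties trade approximation accuracy for sparser factors. Extracting several layers one at a time in this way does not, in general, keep the sparse factors orthogonal, which is the tension the abstract describes.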

Citing Articles

DeepLINK: Deep learning inference using knockoffs with applications to genomics.

Zhu Z, Fan Y, Kong Y, Lv J, Sun F. Proc Natl Acad Sci U S A. 2021; 118(36).

PMID: 34480002 PMC: 8433583. DOI: 10.1073/pnas.2104683118.
