Factor-Adjusted Regularized Model Selection
This paper studies model selection consistency for high-dimensional sparse regression when the data exhibit both cross-sectional and serial dependence. Most commonly used model selection methods fail to consistently recover the true model when the covariates are highly correlated. Motivated by econometric and financial studies, we consider the case where covariate dependence can be reduced through a factor model, and propose a consistent strategy named Factor-Adjusted Regularized Model Selection (FarmSelect). By learning the latent factors and idiosyncratic components and using both of them as predictors, FarmSelect lifts the problem from model selection with highly correlated covariates to model selection with weakly correlated ones. Model selection consistency, as well as optimal rates of convergence, is obtained under mild conditions. Numerical studies demonstrate good finite-sample performance in terms of both model selection and out-of-sample prediction. Moreover, our method is flexible in the sense that it pays no price in weakly correlated and uncorrelated cases. Our method is applicable to a wide range of high-dimensional sparse regression problems. An R package is also provided for implementation.
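The two-step idea described in the abstract (learn the factors and idiosyncratic components, then select among weakly correlated predictors) can be illustrated with a minimal Python sketch. This is not the authors' R package or its API. The function names (`factor_adjust`, `lasso_cd`, `farm_select_sketch`) are hypothetical, the factors are estimated by plain PCA, the number of factors is assumed known, and an L1 penalty fitted by coordinate descent stands in for whatever regularizer one prefers. The factor coefficients are left unpenalized by first regressing the response on the estimated factors.

```python
import numpy as np

def factor_adjust(X, n_factors):
    """Split centered X into estimated factors F and idiosyncratic part U (PCA)."""
    Xc = X - X.mean(axis=0)
    n = Xc.shape[0]
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * left[:, :n_factors]   # n x K, normalized so F'F/n = I
    B = Xc.T @ F / n                       # p x K estimated loadings
    U = Xc - F @ B.T                       # idiosyncratic components, orthogonal to F
    return F, U

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, y, lam, n_iter=200):
    """Minimize (1/2n)||y - A b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = A.shape
    b = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n
    r = y.astype(float).copy()             # current residual y - A b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = A[:, j] @ r / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            r += A[:, j] * (b[j] - new)    # keep residual in sync with b
            b[j] = new
    return b

def farm_select_sketch(X, y, n_factors, lam):
    """Factor-adjust X, remove the factor part of y, then run the lasso on U."""
    F, U = factor_adjust(X, n_factors)
    yc = y - y.mean()
    gamma = F.T @ yc / F.shape[0]          # unpenalized factor coefficients
    y_res = yc - F @ gamma
    beta = lasso_cd(U, y_res, lam)
    return np.flatnonzero(beta), beta
```

Because the estimated idiosyncratic components are orthogonal to the estimated factors by construction, regressing the response on the factors first and then penalizing only the coefficients on `U` gives the same selected set as a joint fit with an unpenalized factor block. In practice the number of factors and the penalty level `lam` would be chosen from the data, for example by an eigenvalue-based criterion and cross-validation, which this sketch does not attempt.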