
Factor-Adjusted Regularized Model Selection

Overview
Journal J Econom
Date 2020 Apr 10
PMID 32269406
Citations 5
Abstract

This paper studies model selection consistency for high-dimensional sparse regression when the data exhibit both cross-sectional and serial dependence. Most commonly used model selection methods fail to recover the true model consistently when the covariates are highly correlated. Motivated by econometric and financial studies, we consider the case where covariate dependence can be reduced through a factor model, and propose a consistent strategy named Factor-Adjusted Regularized Model Selection (FarmSelect). By learning the latent factors and idiosyncratic components and using both as predictors, FarmSelect transforms model selection with highly correlated covariates into model selection with weakly correlated ones via lifting. Model selection consistency, as well as optimal rates of convergence, is obtained under mild conditions. Numerical studies demonstrate good finite-sample performance in terms of both model selection and out-of-sample prediction. Moreover, our method is flexible in that it pays no price in the weakly correlated and uncorrelated cases, and it is applicable to a wide range of high-dimensional sparse regression problems. An R package is also provided for implementation.
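The factor-adjustment ("lifting") step described above can be illustrated with a minimal Python sketch: extract latent factors from the covariates by PCA, form the idiosyncratic residuals, and check that the residuals are far less correlated than the raw covariates. The simulation setup, dimensions, and all names here are illustrative assumptions, not taken from the paper or its R package.

```python
import numpy as np

# Illustrative dimensions (not from the paper): n observations, p covariates, k factors.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 2

# Simulate covariates driven by k latent factors plus idiosyncratic noise:
# X = F B + U, so the columns of X are highly correlated through F.
F = rng.normal(size=(n, k))   # latent factors
B = rng.normal(size=(k, p))   # factor loadings
U = rng.normal(size=(n, p))   # idiosyncratic components
X = F @ B + U                 # observed covariates

# Estimate the factors via PCA: top-k singular triplets of the centered X.
Xc = X - X.mean(axis=0)
left, s, Vt = np.linalg.svd(Xc, full_matrices=False)
F_hat = left[:, :k] * s[:k]   # estimated factors (n x k)
B_hat = Vt[:k, :]             # estimated loadings (k x p)
U_hat = Xc - F_hat @ B_hat    # estimated idiosyncratic components (n x p)

def max_offdiag_corr(M):
    """Largest absolute off-diagonal correlation among the columns of M."""
    C = np.abs(np.corrcoef(M, rowvar=False))
    np.fill_diagonal(C, 0.0)
    return C.max()

# The idiosyncratic components are only weakly correlated, so a standard
# regularized selector (e.g. the lasso applied to [F_hat, U_hat]) can then
# be used where it would fail on the raw, highly correlated X.
print("raw covariates:", max_offdiag_corr(X))
print("idiosyncratic: ", max_offdiag_corr(U_hat))
```

In this sketch the decorrelation comes entirely from removing the common factor structure; the subsequent regularized regression on the lifted predictors is omitted for brevity.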

Citing Articles

Are Latent Factor Regression and Sparse Regression Adequate?

Fan J, Lou Z, Yu M. J Am Stat Assoc. 2024; 119(546):1076-1088.

PMID: 39268549 PMC: 11390100. DOI: 10.1080/01621459.2023.2169700.


Integrative Factor Regression and Its Inference for Multimodal Data Analysis.

Li Q, Li L. J Am Stat Assoc. 2023; 117(540):2207-2221.

PMID: 36793370 PMC: 9928172. DOI: 10.1080/01621459.2021.1914635.


Canonical Thresholding for Non-sparse High-dimensional Linear Regression.

Silin I, Fan J. Ann Stat. 2022; 50(1):460-486.

PMID: 36148472 PMC: 9491498. DOI: 10.1214/21-aos2116.


Bayesian Factor-adjusted Sparse Regression.

Fan J, Jiang B, Sun Q. J Econom. 2022; 230(1):3-19.

PMID: 35754940 PMC: 9223477. DOI: 10.1016/j.jeconom.2020.06.012.


Noisy Matrix Completion: Understanding Statistical Guarantees for Convex Relaxation via Nonconvex Optimization.

Chen Y, Chi Y, Fan J, Ma C, Yan Y. SIAM J Optim. 2021; 30(4):3098-3121.

PMID: 34305368 PMC: 8300474. DOI: 10.1137/19m1290000.
