
EigenPrism: Inference for High Dimensional Signal-to-noise Ratios

Overview
Date 2017 Nov 7
PMID 29104447
Citations 8
Abstract

Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional (p > n) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the ℓ2-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.
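To make the link between the signal norm and the noise level concrete, here is a minimal Python sketch. It is not the EigenPrism procedure itself, whose details the abstract does not give, and it produces point estimates only, not confidence intervals. It assumes the model y = Xβ + ε with i.i.d. standard Gaussian entries in X and noise variance σ²; under that assumption each squared rotated response has conditional mean λ_i‖β‖² + σ², where λ_i is an eigenvalue of XXᵀ/p, so a weighted sum of the squared responses isolates either term. The function names and the minimum-norm choice of weights are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def moment_weights(lam, target):
    """Minimum-norm w with sum(w) = target[0] and sum(w * lam) = target[1].

    Hypothetical helper: the minimum-norm choice is an assumption made here
    for simplicity, not the weighting used by the published procedure.
    """
    A = np.vstack([np.ones_like(lam), lam])  # 2 x n constraint matrix
    w, *_ = np.linalg.lstsq(A, np.asarray(target, dtype=float), rcond=None)
    return w


def norm_and_noise_estimates(X, y):
    """Moment-based point estimates of ||beta||^2 and sigma^2 when p > n.

    Assumes X has i.i.d. N(0, 1) entries, so that with z = U^T y and
    lam_i = d_i^2 / p we have E[z_i^2 | X] = lam_i * ||beta||^2 + sigma^2.
    """
    n, p = X.shape
    U, d, _ = np.linalg.svd(X, full_matrices=False)  # U is n x n when n < p
    lam = d**2 / p                                   # eigenvalues of X X^T / p
    z2 = (U.T @ y) ** 2                              # squared rotated responses
    theta2_hat = moment_weights(lam, (0.0, 1.0)) @ z2  # sigma^2 term cancels
    sigma2_hat = moment_weights(lam, (1.0, 0.0)) @ z2  # ||beta||^2 term cancels
    return float(theta2_hat), float(sigma2_hat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 300, 600                          # high-dimensional: p > n
    beta = rng.normal(size=p)
    beta *= 2.0 / np.linalg.norm(beta)       # dense signal with ||beta||^2 = 4
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)        # sigma^2 = 1
    theta2_hat, sigma2_hat = norm_and_noise_estimates(X, y)
    print("estimated ||beta||^2:", theta2_hat)
    print("estimated sigma^2:", sigma2_hat)
    print("estimated signal-to-noise ratio:", theta2_hat / sigma2_hat)
```

The signal here is deliberately dense, since the estimator uses no sparsity assumption. Individual estimates are noisy at this sample size, and the ratio of the two can even be negative, which is one reason interval estimates matter more than point estimates in this setting.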

Citing Articles

A Regression-based Approach to Robust Estimation and Inference for Genetic Covariance.

Wang J, Li S, Li H. J Am Stat Assoc. 2025; 119(548):2585-2597.

PMID: 39931231 PMC: 11810120. DOI: 10.1080/01621459.2023.2261669.


Optimal Estimation of Genetic Relatedness in High-dimensional Linear Models.

Guo Z, Wang W, Cai T, Li H. J Am Stat Assoc. 2024; 114(525):358-369.

PMID: 38434789 PMC: 10907007. DOI: 10.1080/01621459.2017.1407774.


Inferring the heritability of bacterial traits in the era of machine learning.

Mai T, Lees J, Gladstone R, Corander J. Bioinform Adv. 2023; 3(1):vbad027.

PMID: 36974068 PMC: 10039732. DOI: 10.1093/bioadv/vbad027.


Testability of high-dimensional linear models with nonsparse structures.

Bradic J, Fan J, Zhu Y. Ann Stat. 2022; 50(2):615-639.

PMID: 35814863 PMC: 9266975. DOI: 10.1214/19-aos1932.


Statistical Methods for Assessing the Explained Variation of a Health Outcome by a Mixture of Exposures.

Chen H, Li H, Argos M, Persky V, Turyk M. Int J Environ Res Public Health. 2022; 19(5).

PMID: 35270383 PMC: 8910055. DOI: 10.3390/ijerph19052693.

