
Tweedie's Formula and Selection Bias

Overview
Journal: J Am Stat Assoc
Specialty: Public Health
Date: 2012 Apr 17
PMID: 22505788
Citations: 19
Abstract

We suppose that the statistician observes some large number of estimates z_i, each with its own unobserved expectation parameter μ_i. The largest few of the z_i's are likely to substantially overestimate their corresponding μ_i's, this being an example of selection bias, or regression to the mean. Tweedie's formula, first reported by Robbins in 1956, offers a simple empirical Bayes approach for correcting selection bias. This paper investigates its merits and limitations. In addition to the methodology, Tweedie's formula raises more general questions concerning empirical Bayes theory, discussed here as "relevance" and "empirical Bayes information." There is a close connection between applications of the formula and James-Stein estimation.
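
As an illustration of the correction described in the abstract: when each estimate z is normally distributed around its own μ with known variance σ², Tweedie's formula gives the posterior mean as E[μ | z] = z + σ² · d/dz log f(z), where f is the marginal density of the z's. The sketch below is not taken from the paper. It assumes σ = 1, simulates a normal prior, and estimates log f with a weighted polynomial fit to log histogram counts, which is a simplified stand-in for a proper Poisson regression. The function name `tweedie_estimate` and the parameters `bins` and `degree` are hypothetical choices made for this example.

```python
import numpy as np

def tweedie_estimate(z, sigma=1.0, bins=40, degree=3):
    """Hypothetical helper: empirical Bayes estimate z + sigma^2 * d/dz log f(z).

    log f is approximated (up to an additive constant, which does not
    affect the derivative) by a polynomial fitted to log histogram counts.
    """
    z = np.asarray(z, dtype=float)
    counts, edges = np.histogram(z, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0  # log(0) is undefined, so empty bins are dropped
    # Weighted least squares: var(log count) is roughly 1/count,
    # so residuals are weighted by sqrt(count).
    coef = np.polyfit(centers[keep], np.log(counts[keep]), degree,
                      w=np.sqrt(counts[keep]))
    score = np.polyval(np.polyder(coef), z)  # estimated d/dz log f(z)
    return z + sigma**2 * score

# Simulated data (an assumption for illustration): mu ~ N(0, 1), z | mu ~ N(mu, 1).
rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=5000)
z = mu + rng.normal(0.0, 1.0, size=5000)

mu_hat = tweedie_estimate(z)

# Selection bias: look at the 20 largest observed z values.
top = np.argsort(z)[-20:]
print("mean(z - mu) for the top 20:      ", np.mean(z[top] - mu[top]))
print("mean(mu_hat - mu) for the top 20: ", np.mean(mu_hat[top] - mu[top]))
```

In this simulated setting the marginal density of z is exactly normal with variance 2, so the exact Tweedie correction reduces to the linear shrinkage z/2, which is one way to see the connection to James-Stein estimation mentioned in the abstract.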

Citing Articles

Mendelian randomization and Bayesian model averaging of autoimmune diseases and Long COVID.

Feng J, Chen J, Li X, Ren X, Chen J, Li Z. Front Genet. 2024; 15:1383162.

PMID: 39005628 PMC: 11240141. DOI: 10.3389/fgene.2024.1383162.


An Empirical Bayes Approach to Shrinkage Estimation on the Manifold of Symmetric Positive-Definite Matrices.

Yang C, Doss H, Vemuri B. J Am Stat Assoc. 2024; 119(545):259-272.

PMID: 38590837 PMC: 11000275. DOI: 10.1080/01621459.2022.2110877.


A genome-wide association study of Chinese and English language phenotypes in Hong Kong Chinese children.

Lin Y, Shi Y, Zhang R, Xue X, Rao S, Yin L. NPJ Sci Learn. 2024; 9(1):26.

PMID: 38538593 PMC: 10973362. DOI: 10.1038/s41539-024-00229-7.


SumVg: Total Heritability Explained by All Variants in Genome-Wide Association Studies Based on Summary Statistics with Standard Error Estimates.

So H, Xue X, Ma Z, Sham P. Int J Mol Sci. 2024; 25(2).

PMID: 38279346 PMC: 10816209. DOI: 10.3390/ijms25021347.


Assessing the Most Vulnerable Subgroup to Type II Diabetes Associated with Statin Usage: Evidence from Electronic Health Record Data.

Guo X, Wei W, Liu M, Cai T, Wu C, Wang J. J Am Stat Assoc. 2024; 118(543):1488-1499.

PMID: 38223220 PMC: 10786632. DOI: 10.1080/01621459.2022.2157727.

