Testing for Differences in Polygenic Scores in the Presence of Confounding

Overview
Journal bioRxiv
Date 2023 Mar 30
PMID 36993707
Abstract

Polygenic scores have become an important tool in human genetics, enabling the prediction of individuals' phenotypes from their genotypes. Understanding how the pattern of differences in polygenic score predictions across individuals intersects with variation in ancestry can provide insights into the evolutionary forces acting on the trait in question, and is important for understanding health disparities. However, because most polygenic scores are computed using effect estimates from population samples, they are susceptible to confounding by both genetic and environmental effects that are correlated with ancestry. The extent to which this confounding drives patterns in the distribution of polygenic scores depends on patterns of population structure in both the original estimation panel and in the prediction/test panel. Here, we use theory from population and statistical genetics, together with simulations, to study the procedure of testing for an association between polygenic scores and axes of ancestry variation in the presence of confounding. We use a general model of genetic relatedness to describe how confounding in the estimation panel biases the distribution of polygenic scores in a way that depends on the degree of overlap in population structure between panels. We then show how this confounding can bias tests for associations between polygenic scores and important axes of ancestry variation in the test panel. Specifically, for any given test, there exists a single axis of population structure in the GWAS panel that needs to be controlled for in order to protect the test. Based on this result, we propose a new approach for directly estimating this axis of population structure in the GWAS panel. We then use simulations to compare the performance of this approach to the standard approach in which the principal components of the GWAS panel genotypes are used to control for stratification.
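The test described in the abstract, an association between polygenic scores and an axis of ancestry variation in the test panel, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy genotypes, the effect estimates, and the 0/1 ancestry axis are all hypothetical, and the test statistic used here (a least-squares slope plus a Pearson correlation) is an assumed stand-in for whatever statistic the paper uses. The sketch only shows the mechanics: scores are a weighted sum of genotypes, so any bias in the effect estimates that lines up with the axis passes straight into the statistic.

```python
import math


def polygenic_scores(genotypes, effects):
    """Mean-centered polygenic scores: one weighted allele-count sum per individual.

    genotypes: list of rows (individuals), each a list of allele counts per variant.
    effects:   list of per-variant effect estimates (hypothetical GWAS output).
    """
    scores = [sum(g * b for g, b in zip(row, effects)) for row in genotypes]
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def association_with_axis(scores, axis):
    """Least-squares slope and Pearson correlation of scores on an ancestry axis.

    Both inputs are centered internally, so the slope is the regression
    coefficient of the scores on the axis.
    """
    n = len(scores)
    s_mean = sum(scores) / n
    a_mean = sum(axis) / n
    s = [x - s_mean for x in scores]
    a = [x - a_mean for x in axis]
    cov = sum(x * y for x, y in zip(a, s))
    var_a = sum(x * x for x in a)
    var_s = sum(x * x for x in s)
    slope = cov / var_a
    corr = cov / math.sqrt(var_a * var_s)
    return slope, corr


# Toy test panel (assumed data): 4 individuals, 2 variants.
toy_genotypes = [[0, 1], [1, 1], [2, 0], [2, 2]]
toy_effects = [0.5, -0.2]   # hypothetical effect estimates
toy_axis = [0, 0, 1, 1]     # hypothetical ancestry axis (group label)

toy_scores = polygenic_scores(toy_genotypes, toy_effects)
toy_slope, toy_corr = association_with_axis(toy_scores, toy_axis)
print(toy_slope, toy_corr)
```

With a 0/1 axis, the slope equals the difference in mean polygenic score between the two groups. The sketch makes no attempt to correct for confounding. In the abstract's framing, that correction happens earlier, when the effect estimates are obtained in the GWAS panel.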

Citing Articles

Bayesian approach to assessing population differences in genetic risk of disease with application to prostate cancer.

Timmins I, Dudbridge F. PLoS Genet. 2024; 20(4):e1011212.

PMID: 38630784 PMC: 11023298. DOI: 10.1371/journal.pgen.1011212.
