
A Correction for Sample Overlap in Genome-wide Association Studies in a Polygenic Pleiotropy-informed Framework

Overview
Journal: BMC Genomics
Publisher: BioMed Central
Specialty: Genetics
Date: 2018 Jun 27
PMID: 29940862
Citations: 27
Abstract

Background: There is considerable evidence that many complex traits have a partially shared genetic basis, termed pleiotropy. It is therefore useful to integrate genome-wide association study (GWAS) data across several traits, usually at the summary statistic level. A major practical challenge arises when these GWAS have overlapping subjects. This is particularly an issue when estimating pleiotropy using methods that condition the significance of one trait on the significance of a second, such as the covariate-modulated false discovery rate (cmfdr).

Results: We propose a method for correcting for sample overlap at the summary statistic level. We quantify the expected amount of spurious correlation between the summary statistics from two GWAS due to sample overlap, and use this estimated correlation in a simple linear correction that adjusts the joint distribution of test statistics from the two GWAS. The correction is appropriate for GWAS with case-control or quantitative outcomes. Our simulations and data example show that without correcting for sample overlap, the cmfdr is not properly controlled, leading to an excessive number of false discoveries and an excessive false discovery proportion. Our correction for sample overlap is effective in that it restores proper control of the false discovery rate, at very little loss in power.
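The abstract does not give the formulas, so the following is only a minimal sketch of one common way such a correction can be set up, not necessarily the authors' exact procedure. It assumes quantitative traits, for which the expected null correlation between z-scores from two GWAS is the standard approximation n_overlap * rho / sqrt(n1 * n2), where rho is the phenotypic correlation in the shared subjects, and it assumes the linear correction is a decorrelation (whitening) of the two z-score vectors. The function names `overlap_correlation_quantitative` and `decorrelate_z`, and all sample sizes below, are hypothetical and chosen for illustration.

```python
import numpy as np


def overlap_correlation_quantitative(n1, n2, n_overlap, trait_corr):
    """Expected null correlation of z-scores from two quantitative-trait GWAS.

    Assumption (standard approximation, not taken from the abstract):
    r = n_overlap * trait_corr / sqrt(n1 * n2).
    """
    return n_overlap * trait_corr / np.sqrt(n1 * n2)


def decorrelate_z(z1, z2, r):
    """Apply a linear (whitening) transform that removes null correlation r.

    Multiplies the stacked z-scores by the symmetric inverse square root of
    the 2x2 correlation matrix [[1, r], [r, 1]].
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = np.vstack([np.asarray(z1, dtype=float), np.asarray(z2, dtype=float)])
    corr = np.array([[1.0, r], [r, 1.0]])
    # Symmetric inverse square root via eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(corr)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    z_adj = inv_sqrt @ z
    return z_adj[0], z_adj[1]


# Illustrative null simulation with made-up sample sizes.
rng = np.random.default_rng(0)
r = overlap_correlation_quantitative(n1=10_000, n2=8_000,
                                     n_overlap=4_000, trait_corr=0.5)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, r], [r, 1.0]], size=50_000)
z1_adj, z2_adj = decorrelate_z(z[:, 0], z[:, 1], r)

corr_before = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
corr_after = np.corrcoef(z1_adj, z2_adj)[0, 1]
print(f"null correlation before: {corr_before:.3f}, after: {corr_after:.3f}")
```

In this sketch the symmetric inverse square root is used because it treats the two GWAS alike; a Cholesky-based transform would also remove the correlation but would leave one of the two z-score vectors unchanged. Case-control outcomes would need a different expression for r that accounts for the overlap in cases and in controls separately.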

Conclusions: With our proposed correction, it is possible to integrate GWAS summary statistics with overlapping samples in a statistical framework that is dependent on the joint distribution of the two GWAS.

Citing Articles

Comparison of methods for building polygenic scores for diverse populations.

Gunn S, Wang X, Posner D, Cho K, Huffman J, Gaziano M HGG Adv. 2024; 6(1):100355.

PMID: 39323095 PMC: 11532986. DOI: 10.1016/j.xhgg.2024.100355.


Inflation of polygenic risk scores caused by sample overlap and relatedness: Examples of a major risk of bias.

Ellis C, Oliver K, Harris R, Ottman R, Scheffer I, Mefford H Am J Hum Genet. 2024; 111(9):1805-1809.

PMID: 39168121 PMC: 11393675. DOI: 10.1016/j.ajhg.2024.07.014.


Two-stage strategy using denoising autoencoders for robust reference-free genotype imputation with missing input genotypes.

Kojima K, Tadaka S, Okamura Y, Kinoshita K J Hum Genet. 2024; 69(10):511-518.

PMID: 38918526 PMC: 11422160. DOI: 10.1038/s10038-024-01261-6.


Epistasis and pleiotropy-induced variation for plant breeding.

Dwivedi S, Heslop-Harrison P, Amas J, Ortiz R, Edwards D Plant Biotechnol J. 2024; 22(10):2788-2807.

PMID: 38875130 PMC: 11536456. DOI: 10.1111/pbi.14405.


A cross-trait study of lung cancer and its related respiratory diseases based on large-scale exome sequencing population.

Jiang Y, Li H, Li Z, Du S, Zhang R, Zhao Y Transl Lung Cancer Res. 2024; 13(3):512-525.

PMID: 38601445 PMC: 11002514. DOI: 10.21037/tlcr-24-4.

