
Designing Genome-wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip

Overview
Journal PLoS Genet
Specialty Genetics
Date 2009 Jun 4
PMID 19492015
Citations 282
Abstract

Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped. The most common method of comparing different chips is to use a measure of coverage, but this fails to properly account for the effects of sample size, the genetic model of the disease, and linkage disequilibrium (LD) between SNPs. In this paper, we argue that the statistical power to detect a causative variant should be the major criterion in study design. Because of the complicated pattern of LD in the human genome, power cannot be calculated analytically and must instead be assessed by simulation. We describe in detail a method of simulating case-control samples at a set of linked SNPs that replicates the patterns of LD in human populations, and we use it to assess power for a comprehensive set of available genotyping chips. Our results allow us to compare the performance of the chips in detecting variants with different effect sizes and allele frequencies, to examine how power changes with sample size in different populations or when using multi-marker tags and genotype imputation approaches, and to see how performance compares to that of a hypothetical chip that contains every SNP in HapMap. A main conclusion of this study is that marked differences in genome coverage may not translate into appreciable differences in power and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage. We also show that genotype imputation can be used to boost the power of many chips up to the level obtained from a hypothetical "complete" chip containing all the SNPs in HapMap. Our results have been encapsulated into an R software package that allows users to design future association studies, and our methods provide a framework with which new chip sets can be evaluated.
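
The idea of assessing power by simulation can be illustrated with a deliberately simplified sketch. It is not the method described in the paper, which simulates case-control samples at linked SNPs with realistic LD; here the causal SNP is assumed to be typed directly, so LD and tagging are ignored. The multiplicative disease model, the rare-disease approximation for controls, the allelic chi-square test, the significance threshold `alpha=5e-7`, and every function name below are assumptions made for illustration.

```python
import math
import random


def case_allele_freq(control_freq, relative_risk):
    """Risk-allele frequency in cases under a multiplicative model.

    Assumes a rare disease, so controls carry the population frequency.
    """
    weighted = control_freq * relative_risk
    return weighted / (weighted + (1.0 - control_freq))


def allelic_chi2_pvalue(case_alt, control_alt, n_case_alleles, n_control_alleles):
    """P-value of the 1-df chi-square test on the 2x2 allele-count table."""
    total = n_case_alleles + n_control_alleles
    alt_total = case_alt + control_alt
    if alt_total == 0 or alt_total == total:
        return 1.0  # monomorphic sample: no evidence of association
    stat = 0.0
    for observed_alt, n in ((case_alt, n_case_alleles),
                            (control_alt, n_control_alleles)):
        expected_alt = n * alt_total / total
        expected_ref = n - expected_alt
        observed_ref = n - observed_alt
        stat += (observed_alt - expected_alt) ** 2 / expected_alt
        stat += (observed_ref - expected_ref) ** 2 / expected_ref
    # Chi-square survival function with 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def simulate_power(n_cases, n_controls, control_freq, relative_risk,
                   alpha=5e-7, n_reps=500, seed=1):
    """Fraction of simulated studies in which the causal SNP reaches alpha."""
    rng = random.Random(seed)
    p_case = case_allele_freq(control_freq, relative_risk)
    hits = 0
    for _ in range(n_reps):
        # Each individual contributes two alleles, drawn independently.
        case_alt = sum(rng.random() < p_case for _ in range(2 * n_cases))
        control_alt = sum(rng.random() < control_freq
                          for _ in range(2 * n_controls))
        p_value = allelic_chi2_pvalue(case_alt, control_alt,
                                      2 * n_cases, 2 * n_controls)
        if p_value < alpha:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    # Hypothetical scenario: risk-allele frequency 0.2, relative risk 1.3.
    for n in (500, 1000, 2000):
        power = simulate_power(n, n, control_freq=0.2, relative_risk=1.3,
                               n_reps=200)
        print(f"{n} cases / {n} controls: estimated power {power:.2f}")
```

Running the script prints the estimated power at three sample sizes for one hypothetical effect size and allele frequency. A chip comparison of the kind the abstract describes would additionally need to model the LD between each chip's typed SNPs and the untyped causal variant, which this sketch leaves out.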

Citing Articles

SNP Genotype Imputation in Forensics-A Performance Study.

Tillmar A, Kling D. Genes (Basel). 2024; 15(11).

PMID: 39596586 PMC: 11593911. DOI: 10.3390/genes15111386.


Meta-analysis of genome-wide association studies of stable warfarin dose in patients of African ancestry.

Asiimwe I, Blockman M, Cavallari L, Cohen K, Cupido C, Dandara C. Blood Adv. 2024; 8(20):5248-5261.

PMID: 39163621 PMC: 11493193. DOI: 10.1182/bloodadvances.2024014227.


Leveraging sex-genetic interactions to understand brain disorders: recent advances and current gaps.

Neale N, Lona-Durazo F, Ryten M, Gagliano Taliun S. Brain Commun. 2024; 6(3):fcae192.

PMID: 38894947 PMC: 11184352. DOI: 10.1093/braincomms/fcae192.


A resampling-based approach to share reference panels.

Cavinato T, Rubinacci S, Malaspinas A, Delaneau O. Nat Comput Sci. 2024; 4(5):360-366.

PMID: 38745108 PMC: 11136649. DOI: 10.1038/s43588-024-00630-7.


Genetic Background of Blood β-Hydroxybutyrate Acid Concentrations in Early-Lactating Holstein Dairy Cows Based on Genome-Wide Association Analyses.

Wang Y, Wang Z, Liu W, Xie S, Ren X, Yan L. Genes (Basel). 2024; 15(4).

PMID: 38674346 PMC: 11049649. DOI: 10.3390/genes15040412.

