
Fast and Accurate Genotype Imputation in Genome-wide Association Studies Through Pre-phasing

Overview
Journal Nat Genet
Specialty Genetics
Date 2012 Jul 24
PMID 22820512
Citations 1107
Abstract

The 1000 Genomes Project and disease-specific sequencing efforts are producing large collections of haplotypes that can be used as reference panels for genotype imputation in genome-wide association studies (GWAS). However, imputing from large reference panels with existing methods imposes a high computational burden. We introduce a strategy called 'pre-phasing' that maintains the accuracy of leading methods while reducing computational costs. We first statistically estimate the haplotypes for each individual within the GWAS sample (pre-phasing) and then impute missing genotypes into these estimated haplotypes. This reduces the computational cost because (i) the GWAS samples must be phased only once, whereas standard methods would implicitly repeat phasing with each reference panel update, and (ii) it is much faster to match a phased GWAS haplotype to one reference haplotype than to match two unphased GWAS genotypes to a pair of reference haplotypes. We implemented our approach in the MaCH and IMPUTE2 frameworks, and we tested it on data sets from the Wellcome Trust Case Control Consortium 2 (WTCCC2), the Genetic Association Information Network (GAIN), the Women's Health Initiative (WHI) and the 1000 Genomes Project. This strategy will be particularly valuable for repeated imputation as reference panels evolve.
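Point (ii) can be made concrete with a toy example. The sketch below is not the MaCH or IMPUTE2 implementation; it is a minimal haploid copying model in the Li and Stephens style, written only to illustrate the idea of imputing into an already phased haplotype. The function names and the `switch` and `error` parameters are hypothetical, and their default values are arbitrary. Once a GWAS haplotype is phased, each marker needs only one hidden state per reference haplotype (N states), whereas a model working on unphased genotypes must consider pairs of reference haplotypes (on the order of N² states).

```python
from typing import List, Optional, Sequence


def haploid_states(n_ref: int) -> int:
    """Hidden states per marker when matching one phased haplotype."""
    return n_ref


def diploid_states(n_ref: int) -> int:
    """Hidden states per marker when matching an unphased genotype
    to an ordered pair of reference haplotypes."""
    return n_ref * n_ref


def _normalise(values: Sequence[float]) -> List[float]:
    total = sum(values)
    return [v / total for v in values]


def impute_haplotype(
    target: Sequence[Optional[int]],
    reference: Sequence[Sequence[int]],
    switch: float = 0.01,   # assumed per-interval template switch probability
    error: float = 0.001,   # assumed allele mismatch probability
) -> List[float]:
    """Return the posterior probability of allele 1 at every marker.

    `target` is one pre-phased haplotype with None at untyped markers;
    `reference` holds the reference haplotypes (0/1 alleles).
    """
    n, m = len(reference), len(target)

    def emit(j: int, k: int) -> float:
        # Untyped markers carry no information about the copied template.
        if target[k] is None:
            return 1.0
        return 1.0 - error if reference[j][k] == target[k] else error

    # Forward pass: O(n) work per marker, rescaled to avoid underflow.
    fwd = [_normalise([emit(j, 0) for j in range(n)])]
    for k in range(1, m):
        prev = fwd[-1]
        jump = switch * sum(prev) / n
        fwd.append(_normalise(
            [emit(j, k) * ((1 - switch) * prev[j] + jump) for j in range(n)]
        ))

    # Backward pass with the same transition structure.
    bwd = [[1.0] * n for _ in range(m)]
    for k in range(m - 2, -1, -1):
        weighted = [emit(j, k + 1) * bwd[k + 1][j] for j in range(n)]
        jump = switch * sum(weighted) / n
        bwd[k] = _normalise(
            [(1 - switch) * weighted[j] + jump for j in range(n)]
        )

    # Combine both passes and average the reference alleles.
    dosages = []
    for k in range(m):
        post = _normalise([fwd[k][j] * bwd[k][j] for j in range(n)])
        dosages.append(sum(post[j] * reference[j][k] for j in range(n)))
    return dosages


# Toy data: three reference haplotypes, one phased target with a gap.
reference_panel = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
]
phased_target = [1, 1, None, 1, 1]
dosages = impute_haplotype(phased_target, reference_panel)
```

In this toy panel the target agrees with the second reference haplotype at every typed marker, so the imputed probability of allele 1 at the untyped third marker comes out close to 1. Because the phased haplotype is an input, the same `phased_target` can be reused unchanged when `reference_panel` is replaced by an updated panel, which corresponds to point (i).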

Citing Articles

The importance of genotyping within the climate-smart plant breeding value chain - integrative tools for genetic enhancement programs.

Garcia-Oliveira A, Ortiz R, Sarsu F, Rasmussen S, Agre P, Asfaw A Front Plant Sci. 2025; 15:1518123.

PMID: 39980758 PMC: 11839310. DOI: 10.3389/fpls.2024.1518123.


Genome-wide association study of prostate-specific antigen levels in 392,522 men identifies new loci and improves prediction across ancestry groups.

Hoffmann T, Graff R, Madduri R, Rodriguez A, Cario C, Feng K Nat Genet. 2025; 57(2):334-344.

PMID: 39930085 PMC: 11821537. DOI: 10.1038/s41588-024-02068-z.


Fast and accurate imputation of genotypes from noisy low-coverage sequencing data in bi-parental populations.

Triay C, Boizet A, Fragoso C, Gkanogiannis A, Rami J, Lorieux M PLoS One. 2025; 20(1):e0314759.

PMID: 39883620 PMC: 11781708. DOI: 10.1371/journal.pone.0314759.


Genetic coupling of enhancer activity and connectivity in gene expression control.

Ray-Jones H, Sung C, Chan L, Haglund A, Artemov P, Della Rosa M Nat Commun. 2025; 16(1):970.

PMID: 39870618 PMC: 11772589. DOI: 10.1038/s41467-025-55900-3.


GWAS highlights the neuronal contribution to multiple sclerosis susceptibility.

De Jager P, Zeng L, Khan A, Lama T, Chitnis T, Weiner H Res Sq. 2025; .

PMID: 39866869 PMC: 11760239. DOI: 10.21203/rs.3.rs-5644532/v1.

