Reference-based Phasing Using the Haplotype Reference Consortium Panel

Overview
Journal: Nat Genet
Specialty: Genetics
Date: 2016 Oct 4
PMID: 27694958
Citations: 893
Abstract

Haplotype phasing is a fundamental problem in medical and population genetics. Phasing is generally performed via statistical phasing in a genotyped cohort, an approach that can yield high accuracy in very large cohorts but attains lower accuracy in smaller cohorts. Here we instead explore the paradigm of reference-based phasing. We introduce a new phasing algorithm, Eagle2, that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform. We demonstrate that Eagle2 attains a ∼20× speedup and ∼10% increase in accuracy compared to reference-based phasing using SHAPEIT2. On European-ancestry samples, Eagle2 with the HRC panel achieves >2× the accuracy of 1000 Genomes-based phasing. Eagle2 is open source and freely available for HRC-based phasing via the Sanger Imputation Service and the Michigan Imputation Server.
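The abstract names the positional Burrows-Wheeler transform (PBWT) as the basis of Eagle2's data structure without describing it. As a rough illustration only, the sketch below shows the basic PBWT ordering step on a toy biallelic panel: at each site, haplotypes are stably partitioned by allele, so that haplotypes sharing long matching stretches ending at that site become adjacent in the ordering. This is not Eagle2's actual data structure, which the abstract does not specify; the function name `pbwt_prefix_orders` and the toy panel `HAPS` are hypothetical.

```python
from typing import List


def pbwt_prefix_orders(haplotypes: List[List[int]]) -> List[List[int]]:
    """Return the haplotype ordering before site 0 and after each site.

    orders[k] lists haplotype indices sorted by their reversed prefixes
    over sites 0..k-1 (assumes biallelic 0/1 alleles, equal-length rows).
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    order = list(range(len(haplotypes)))
    orders = [order[:]]
    for k in range(n_sites):
        # Stable partition: haplotypes carrying allele 0 at site k come
        # first, then those carrying allele 1, each keeping prior order.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order[:])
    return orders


# Toy reference panel (hypothetical): 4 haplotypes x 3 sites.
HAPS = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]

orders = pbwt_prefix_orders(HAPS)
print(orders[-1])  # [0, 3, 1, 2]
```

In the final ordering the two identical haplotypes (indices 0 and 3) sit next to each other, which is the property that lets PBWT-based methods find closely matching reference haplotypes without comparing every pair.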

Citing Articles

Potentially causal associations between placental DNA methylation and schizophrenia and other neuropsychiatric disorders.

Cilleros-Portet A, Lesseur C, Mari S, Cosin-Tomas M, Lozano M, Irizar A. Nat Commun. 2025; 16(1):2431.

PMID: 40087310 DOI: 10.1038/s41467-025-57760-3.


Ensemble-learning approach improves fracture prediction using genomic and phenotypic data.

Wu Q, Jung J. Osteoporos Int. 2025; .

PMID: 40053072 DOI: 10.1007/s00198-025-07437-w.


Using genotype imputation to integrate Canola populations for genome-wide association and genomic prediction of blackleg resistance.

Zhao H, MacLeod I, Keeble-Gagnere G, Barbulescu D, Tibbits J, Kaur S. BMC Genomics. 2025; 26(1):215.

PMID: 40038585 PMC: 11877698. DOI: 10.1186/s12864-025-11250-4.


ralphi: a deep reinforcement learning framework for haplotype assembly.

Battistella E, Maheshwari A, Ekim B, Berger B, Popic V. bioRxiv. 2025; .

PMID: 40027721 PMC: 11870604. DOI: 10.1101/2025.02.17.638151.


Epidemiology and genetic determination of measures of peripheral vascular health in the Long Life Family Study.

Fricke D, Cvejkus R, Barinas-Mitchell E, Feitosa M, Murabito J, Acharya S. Aging (Albany NY). 2025; 17(2):464-481.

PMID: 40013929 PMC: 11892930. DOI: 10.18632/aging.206204.

