Performance and Accuracy Evaluation of Reference Panels for Genotype Imputation in Sub-Saharan African Populations

Overview
Journal: Cell Genom
Date: 2023 Jun 30
PMID: 37388906
Abstract

Based on evaluations of imputation performed on a genotype dataset consisting of about 11,000 sub-Saharan African (SSA) participants, we show Trans-Omics for Precision Medicine (TOPMed) and the African Genome Resource (AGR) to be currently the best panels for imputing SSA datasets. We report notable differences in the number of single-nucleotide polymorphisms (SNPs) that are imputed by different panels in datasets from East, West, and South Africa. Comparisons with a subset of 95 SSA high-coverage whole-genome sequences (WGSs) show that despite being about 20-fold smaller, the AGR imputed dataset has higher concordance with the WGSs. Moreover, the level of concordance between imputed and WGS datasets was strongly influenced by the extent of Khoe-San ancestry in a genome, highlighting the need for integration of not only geographically but also ancestrally diverse WGS data in reference panels for further improvement in imputation of SSA datasets. Approaches that integrate imputed data from different panels could also lead to better imputation.
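
The concordance comparison described in the abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2), with None for missing calls, and that both lists cover the same sites in the same order. The function name and the non-reference option are illustrative assumptions, not the metric definition used by the authors.

```python
from typing import Optional, Sequence


def genotype_concordance(
    imputed: Sequence[Optional[int]],
    sequenced: Sequence[Optional[int]],
    non_reference_only: bool = False,
) -> Optional[float]:
    """Fraction of compared sites where the imputed call equals the WGS call.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
    With non_reference_only=True, sites where both calls are homozygous
    reference (0 and 0) are excluded, so common reference matches do not
    inflate the score. Returns None if no site can be compared.
    """
    if len(imputed) != len(sequenced):
        raise ValueError("genotype lists must cover the same sites")

    compared = 0
    matches = 0
    for imp, seq in zip(imputed, sequenced):
        if imp is None or seq is None:
            continue  # skip sites missing in either dataset
        if non_reference_only and imp == 0 and seq == 0:
            continue  # skip sites that are homozygous reference in both
        compared += 1
        if imp == seq:
            matches += 1

    return matches / compared if compared else None


# Hypothetical toy data for one individual at six sites.
imputed_calls = [0, 1, 2, 0, None, 1]
wgs_calls = [0, 1, 1, 0, 2, 1]

overall = genotype_concordance(imputed_calls, wgs_calls)
non_ref = genotype_concordance(imputed_calls, wgs_calls, non_reference_only=True)
print(overall, non_ref)
```

Computing such a score per individual, once for each panel's imputed output, would allow the panels to be compared against the same sequenced genomes and the scores to be related to an individual's ancestry proportions.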

Citing Articles

FLT1 and other candidate fetal haemoglobin modifying loci in sickle cell disease in African ancestries.

Wonkam A, Esoh K, Levine R, Ngo Bitoungui V, Mnika K, Nimmagadda N. Nat Commun. 2025; 16(1):2092.

PMID: 40025045 PMC: 11873275. DOI: 10.1038/s41467-025-57413-5.


Cohort Profile: Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) in four sub-Saharan African countries.

Tluway F, Agongo G, Baloyi V, Boua P, Kisiangani I, Lingani M. Int J Epidemiol. 2025; 54(1).

PMID: 39899987 PMC: 11790221. DOI: 10.1093/ije/dyae173.


Type 1 diabetes genetic risk score variation across ancestries using whole genome sequencing and array-based approaches.

Arni A, Fraser D, Sharp S, Oram R, Johnson M, Weedon M. Sci Rep. 2024; 14(1):31044.

PMID: 39730838 PMC: 11680773. DOI: 10.1038/s41598-024-82278-x.


Rare variant analyses in 51,256 type 2 diabetes cases and 370,487 controls reveal the pathogenicity spectrum of monogenic diabetes genes.

Huerta-Chagoya A, Schroeder P, Mandla R, Li J, Morris L, Vora M. Nat Genet. 2024; 56(11):2370-2379.

PMID: 39379762 PMC: 11549050. DOI: 10.1038/s41588-024-01947-9.


A GWAS of ACE Inhibitor-Induced Angioedema in a South African Population.

Mugo J, Day C, Choudhury A, Deetlefs M, Freercks R, Geraty S. medRxiv. 2024.

PMID: 39314982 PMC: 11419215. DOI: 10.1101/2024.09.13.24313664.

