
Imputation-based Population Genetics Analysis of Plasmodium falciparum Malaria Parasites

Overview
Journal: PLoS Genet
Specialty: Genetics
Date: 2015 May 1
PMID: 25928499
Citations: 24
Abstract

Whole-genome sequencing technologies are being increasingly applied to Plasmodium falciparum clinical isolates to identify genetic determinants of malaria pathogenesis. However, genome-wide discovery methods, such as haplotype scans for signatures of natural selection, are hindered by missing genotypes in sequence data. Poor correlation between single nucleotide polymorphisms (SNPs) in the P. falciparum genome complicates efforts to apply established missing-genotype imputation methods that leverage patterns of linkage disequilibrium (LD). The accuracy of state-of-the-art, LD-based imputation methods (IMPUTE, Beagle) was assessed by measuring allelic r² for 459 P. falciparum samples from malaria patients in 4 countries: Thailand, Cambodia, Gambia, and Malawi. After restricting our analysis to 86k high-quality SNPs across the populations, we found that the complete-case analysis was limited to 21k SNPs (24.5%), despite no single SNP having more than 10% missing genotypes. The accuracy of Beagle in filling in missing genotypes was consistently high across all populations (allelic r², 0.87-0.96), but the performance of IMPUTE was mixed (allelic r², 0.34-0.99) depending on reference haplotypes and population. Positive selection analysis using Beagle-imputed haplotypes identified loci involved in resistance to chloroquine (crt) in Thailand, Cambodia, and Gambia, sulfadoxine-pyrimethamine (dhfr, dhps) in Cambodia, and artemisinin (kelch13) in Cambodia. Tajima's D-based analysis identified genes under balancing selection that encode well-characterized vaccine candidates: apical membrane antigen 1 (ama1) and merozoite surface protein 1 (msp1). In contrast, the complete-case analysis failed to identify any well-validated drug resistance or candidate vaccine loci, except kelch13. In a setting of low LD and modest levels of missing genotypes, using Beagle to impute P. falciparum genotypes is a viable strategy for conducting accurate large-scale population genetics and association analyses, and supporting global surveillance for drug resistance markers and candidate vaccine antigens.
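
The two quantities the abstract relies on, the complete-case SNP set and allelic r², can be illustrated with a minimal sketch. The abstract does not say how allelic r² was computed, so this assumes the common definition: the squared Pearson correlation between the true alleles at deliberately masked genotypes and the imputed allele dosages. It also assumes haploid 0/1 allele coding with None marking a missing call. The function names and toy data are hypothetical and are not taken from the paper.

```python
# Hypothetical helpers for illustration only; not the authors' pipeline.
# Assumption: genotypes are haploid calls coded 0/1, with None for missing.

def complete_case_snps(genotypes):
    """Return indices of SNP columns with no missing call in any sample."""
    n_snps = len(genotypes[0])
    return [
        j for j in range(n_snps)
        if all(row[j] is not None for row in genotypes)
    ]


def allelic_r2(true_alleles, imputed_dosages):
    """Squared Pearson correlation between true alleles and imputed dosages."""
    n = len(true_alleles)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_t = sum(true_alleles) / n
    mean_i = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_alleles, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_alleles)
    var_i = sum((i - mean_i) ** 2 for i in imputed_dosages)
    if var_t == 0 or var_i == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov * cov / (var_t * var_i)


# Toy matrix: 4 samples (rows) x 3 SNPs (columns).
toy_genotypes = [
    [0, 1, None],
    [1, 1, 0],
    [0, None, 1],
    [1, 0, 1],
]
kept = complete_case_snps(toy_genotypes)

# True alleles at masked positions versus hypothetical imputed dosages.
r2 = allelic_r2([0, 0, 1, 1], [0.1, 0.2, 0.9, 0.6])
print(kept, round(r2, 3))
```

In the toy matrix only the first SNP survives the complete-case filter, even though each SNP is missing in at most one sample. This mirrors, on a small scale, how a low per-SNP missingness rate can still remove most SNPs from a complete-case analysis.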

Citing Articles

Rapid profiling of Plasmodium parasites from genome sequences to assist malaria control.

Phelan J, Turkiewicz A, Manko E, Thorpe J, Vanheer L, van de Vegte-Bolmer M. Genome Med. 2023; 15(1):96.

PMID: 37950308 PMC: 10636944. DOI: 10.1186/s13073-023-01247-7.


Geographical classification of malaria parasites through applying machine learning to whole genome sequence data.

Deelder W, Manko E, Phelan J, Campino S, Palla L, Clark T. Sci Rep. 2022; 12(1):21150.

PMID: 36476815 PMC: 9729610. DOI: 10.1038/s41598-022-25568-6.


Genome-wide SNP analysis of Plasmodium falciparum shows differentiation at drug-resistance-associated loci among malaria transmission settings in southern Mali.

Coulibaly A, Diop M, Kone A, Dara A, Ouattara A, Mulder N. Front Genet. 2022; 13:943445.

PMID: 36267403 PMC: 9576839. DOI: 10.3389/fgene.2022.943445.


Characterizing the genomic variation and population dynamics of Plasmodium falciparum malaria parasites in and around Lake Victoria, Kenya.

Osborne A, Manko E, Takeda M, Kaneko A, Kagaya W, Chan C. Sci Rep. 2021; 11(1):19809.

PMID: 34615917 PMC: 8494747. DOI: 10.1038/s41598-021-99192-1.


Using deep learning to identify recent positive selection in malaria parasite sequence data.

Deelder W, Diez Benavente E, Phelan J, Manko E, Campino S, Palla L. Malar J. 2021; 20(1):270.

PMID: 34126997 PMC: 8201710. DOI: 10.1186/s12936-021-03788-x.

