
Imputeqc: an R Package for Assessing Imputation Quality of Genotypes and Optimizing Imputation Parameters

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2020 Jul 25
PMID: 32703240
Citations: 7
Abstract

Background: Genotype imputation increases the power of genome-wide association studies, but its quality should be assessed in each particular case. Not all imputation software reports the error of its output; for example, the latest release of fastPHASE (1.4.8) lacks such an option. fastPHASE also leaves the choice of model parameters uncertain: it is based on haplotype clusters, whose number must be set a priori. This parameter influences the results of imputation and of downstream analysis.

Results: We present imputeqc, a software toolkit for assessing imputation quality and/or choosing the model parameters for imputation. We demonstrate the toolkit's efficacy by evaluating imputations made with both fastPHASE and BEAGLE on HapMap and 1000 Genomes data. The genotype discordance obtained with the two methods correlated well. Using imputeqc, we also show how to choose the optimal number of haplotype clusters and expectation-maximization cycles for fastPHASE. The resulting number of haplotype clusters, 25, was then used in hapFLK testing, which revealed signatures of selection in the LCT region on chromosome 2. We also demonstrate how to reduce the computation time of hapFLK testing from 3 days to 20 h.
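The quality measure named above, genotype discordance, can be illustrated with a minimal Python sketch. It assumes a mask-and-compare design: a fraction of known genotypes is hidden, the data are imputed by an external program, and the imputed values are compared with the hidden truth. This design, the allele-count coding (0, 1, 2, with `None` for missing) and all function names are illustrative assumptions, not the imputeqc API.

```python
import random


def mask_genotypes(genotypes, proportion, seed=0):
    """Hide roughly `proportion` of the known genotypes.

    `genotypes` is a list of rows (one per individual) of allele counts
    0/1/2, with None marking a missing call. Returns the masked copy and
    the set of (row, column) positions that were hidden.
    """
    rng = random.Random(seed)
    known = [
        (i, j)
        for i, row in enumerate(genotypes)
        for j, g in enumerate(row)
        if g is not None
    ]
    hidden = set(rng.sample(known, round(len(known) * proportion)))
    masked = [
        [None if (i, j) in hidden else g for j, g in enumerate(row)]
        for i, row in enumerate(genotypes)
    ]
    return masked, hidden


def discordance(truth, imputed, hidden):
    """Fraction of hidden genotypes that the imputation got wrong."""
    if not hidden:
        raise ValueError("no masked genotypes to compare")
    wrong = sum(truth[i][j] != imputed[i][j] for i, j in hidden)
    return wrong / len(hidden)


def best_setting(scores):
    """Return the parameter value with the lowest mean discordance.

    `scores` maps a candidate value (for example, a number of haplotype
    clusters) to the discordances measured on several masked test files.
    """
    return min(scores, key=lambda k: sum(scores[k]) / len(scores[k]))
```

In this sketch, each candidate parameter value would be scored on several independently masked copies of the data, and `best_setting` would select the value whose mean discordance is lowest. The imputation step itself (running fastPHASE or BEAGLE on the masked file) is deliberately left outside the code.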

Conclusions: The toolkit is implemented as the R package imputeqc and a set of command-line scripts. The code is freely available at https://github.com/inzilico/imputeqc under the MIT license.

Citing Articles

Genome-wide scans for signatures of selection in North African sheep reveals differentially selected regions between fat- and thin-tailed breeds.

Ben-Jemaa S, Yahyaoui G, Kdidi S, Najjari A, Lenstra J, Mastrangelo S. Anim Genet. 2024; 56(1):e13487.

PMID: 39573836 PMC: 11653233. DOI: 10.1111/age.13487.


Whole genome sequencing reveals signals of adaptive admixture in Creole cattle.

Ben-Jemaa S, Adam G, Boussaha M, Bardou P, Klopp C, Mandonnet N. Sci Rep. 2023; 13(1):12155.

PMID: 37500674 PMC: 10374910. DOI: 10.1038/s41598-023-38774-7.


A comprehensive analysis of the genetic diversity and environmental adaptability in worldwide Merino and Merino-derived sheep breeds.

Ceccobelli S, Landi V, Senczuk G, Mastrangelo S, Sardina M, Ben-Jemaa S. Genet Sel Evol. 2023; 55(1):24.

PMID: 37013467 PMC: 10069132. DOI: 10.1186/s12711-023-00797-z.


Genome-wide mapping of signatures of selection using a high-density array identified candidate genes for growth traits and local adaptation in chickens.

Mastrangelo S, Ben-Jemaa S, Perini F, Cendron F, Biscarini F, Lasagna E. Genet Sel Evol. 2023; 55(1):20.

PMID: 36959552 PMC: 10035218. DOI: 10.1186/s12711-023-00790-6.


Selection signature analysis and genome-wide divergence of South African Merino breeds from their founders.

Dzomba E, Van Der Nest M, Mthembu J, Soma P, Snyman M, Chimonyo M. Front Genet. 2023; 13:932272.

PMID: 36685923 PMC: 9847500. DOI: 10.3389/fgene.2022.932272.

