
Strategies for Obtaining and Pruning Imputed Whole-Genome Sequence Data for Genomic Prediction

Overview
Journal Front Genet
Date 2019 Aug 6
PMID 31379929
Citations 18
Abstract

Genomic prediction with imputed whole-genome sequencing (WGS) data is an attractive approach to improve predictive ability at low cost. However, high accuracy has not been realized using this method in livestock. In this study, we imputed 435 individuals from 600K single nucleotide polymorphism (SNP) chip data to WGS data using different reference panels. We also investigated the prediction accuracy of genomic best linear unbiased prediction (GBLUP) using imputed WGS data from different reference panels, linkage disequilibrium (LD)-based marker pruning, and pre-selected variants based on genome-wide association study (GWAS) results. The imputation accuracies from 600K to WGS data were 0.873 ± 0.038, 0.906 ± 0.036, and 0.979 ± 0.010 for the internal, external, and combined reference panels, respectively. For most traits in chickens, the prediction accuracy of imputed WGS data obtained from the internal reference panel was greater than or equal to that of the combined reference panel; the external reference panel had the lowest prediction accuracy. Compared with 600K chip data, GBLUP with imputed WGS data showed only a small increase (1-3%) in prediction accuracy. Using only variants selected from imputed WGS data based on GWAS results yielded almost no increase in accuracy for most traits and even increased the bias of the regression coefficient. The degree of LD of the selected variants and of the remaining variants affected prediction accuracy differently. For average daily gain (ADG), residual feed intake (RFI), intestine length (IL), and body weight at 91 days (BW91), the accuracy of GBLUP increased as the degree of LD of selected variants decreased, but the opposite relationship occurred for the remaining variants. For breast muscle weight (BMW) and average daily feed intake (ADFI), by contrast, the accuracy of GBLUP increased as the degree of LD of selected variants increased, and the degree of LD of remaining variants had only a small effect on prediction accuracy.
Overall, the optimal imputation strategy to obtain WGS data for genomic prediction should consider the relationship between selected individuals and target population individuals to avoid heterogeneity of imputation. LD-based marker pruning can be used to improve the accuracy of genomic prediction using imputed WGS data.
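To make the two computational steps named in the abstract more concrete, the following is a minimal Python sketch of LD-based marker pruning followed by GBLUP. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`ld_prune`, `vanraden_g`, `gblup`), the r² threshold, the window size, and the heritability value are all hypothetical choices, the genomic relationship matrix follows the common VanRaden-style construction, and the overall mean is approximated by the training-set phenotype mean rather than estimated jointly.

```python
import numpy as np


def ld_prune(genotypes, r2_threshold=0.5, window=50):
    """Greedy LD pruning on an (individuals x markers) matrix coded 0/1/2.

    A marker is kept only if its squared correlation with each of the last
    `window` kept markers is below `r2_threshold`. Monomorphic markers are
    dropped. Threshold and window are illustrative assumptions.
    """
    kept = []
    for j in range(genotypes.shape[1]):
        col = genotypes[:, j]
        if col.std() == 0:
            continue  # monomorphic marker carries no information
        redundant = False
        for k in kept[-window:]:
            r = np.corrcoef(col, genotypes[:, k])[0, 1]
            if r * r >= r2_threshold:
                redundant = True
                break
        if not redundant:
            kept.append(j)
    return np.array(kept, dtype=int)


def vanraden_g(genotypes):
    """Genomic relationship matrix G = ZZ' / (2 * sum(p * (1 - p)))."""
    p = genotypes.mean(axis=0) / 2.0      # allele frequency per marker
    z = genotypes - 2.0 * p               # centre each marker column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup(g, y, train_idx, h2=0.5):
    """Predict genomic breeding values for all individuals.

    Model: y = mu + u + e with var(u) = G * sigma_u^2. The variance ratio
    lambda = (1 - h2) / h2 uses an assumed heritability h2, and mu is
    approximated by the training mean (a simplification).
    """
    train_idx = np.asarray(train_idx)
    lam = (1.0 - h2) / h2
    g_tt = g[np.ix_(train_idx, train_idx)]
    y_t = y[train_idx]
    alpha = np.linalg.solve(g_tt + lam * np.eye(len(train_idx)), y_t - y_t.mean())
    return g[:, train_idx] @ alpha


if __name__ == "__main__":
    # Toy data: 6 individuals, 4 markers (marker 1 duplicates marker 0,
    # marker 3 is monomorphic). Values are made up for illustration.
    x = np.array([[0, 0, 2, 1], [1, 1, 0, 1], [2, 2, 1, 1],
                  [0, 0, 1, 1], [1, 1, 2, 1], [2, 2, 0, 1]], dtype=float)
    y = np.array([1.0, 2.0, 3.5, 0.5, 2.5, 3.0])
    kept = ld_prune(x, r2_threshold=0.5)
    g = vanraden_g(x[:, kept])
    print(kept, gblup(g, y, train_idx=[0, 1, 2, 3]))
```

In a validation setting, prediction accuracy would typically be measured as the correlation between the predicted values and the (adjusted) phenotypes of individuals left out of `train_idx`, and the pruning threshold could then be varied to see how accuracy responds.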

Citing Articles

Improvement of the accuracy of breeding value prediction for egg production traits in Muscovy duck using low-coverage whole-genome sequence data.

Ye H, Ji C, Liu X, Bello S, Guo L, Fang X Poult Sci. 2025; 104(2):104812.

PMID: 39817986 PMC: 11786738. DOI: 10.1016/j.psj.2025.104812.


Explainable artificial intelligence for genotype-to-phenotype prediction in plant breeding: a case study with a dataset from an almond germplasm collection.

Novielli P, Romano D, Pavan S, Losciale P, Stellacci A, Diacono D Front Plant Sci. 2024; 15:1434229.

PMID: 39319003 PMC: 11420924. DOI: 10.3389/fpls.2024.1434229.


GWAS Enhances Genomic Prediction Accuracy of Caviar Yield, Caviar Color and Body Weight Traits in Sturgeons Using Whole-Genome Sequencing Data.

Song H, Dong T, Wang W, Yan X, Geng C, Bai S Int J Mol Sci. 2024; 25(17).

PMID: 39273703 PMC: 11395957. DOI: 10.3390/ijms25179756.


Landscape genomics reveals regions associated with adaptive phenotypic and genetic variation in Ethiopian indigenous chickens.

Kebede F, Derks M, Dessie T, Hanotte O, Barros C, Crooijmans R BMC Genomics. 2024; 25(1):284.

PMID: 38500079 PMC: 10946127. DOI: 10.1186/s12864-024-10193-6.


Genomic prediction based on preselected single-nucleotide polymorphisms from genome-wide association study and imputed whole-genome sequence data annotation for growth traits in Duroc pigs.

Zhang Y, Zhuang Z, Liu Y, Huang J, Luan M, Zhao X Evol Appl. 2024; 17(2):e13651.

PMID: 38362509 PMC: 10868536. DOI: 10.1111/eva.13651.

