
Comparing Assignment-based Approaches to Breed Identification Within a Large Set of Horses

Overview
Journal J Appl Genet
Publisher Springer
Specialty Genetics
Date 2019 Apr 10
PMID 30963515
Citations 3
Abstract

Given its extensive data sets and statistical techniques, animal breeding is a natural field of application for machine learning, whose impact on breeding is steadily increasing. This study presents the potential of machine learning and data mining within a large set of horses and breeds. Individual assignment methods, and the factors influencing their success rate, are compared at the scale of the Czech population. The fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation among the breeds amounted to 8.9%. The highest genetic divergence (F_ST = 0.378) was found between the Friesian and Equus przewalskii; the highest degree of gene migration was obtained between the Czech and Bavarian Warmblood (N = 14,302); and the overall global heterozygote deficit across the populations was 10.4%. Eight standard methods (Bayesian, frequency-based, and distance-based) implemented in the GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared using 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and the Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. Genomic breed prediction accuracy was highest in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the number of breeds and their genetic divergence. These statistical tools could be used to assess breed traceability systems, and they have the potential to assist managers in decisions regarding breeding and registration.
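
To make the idea of frequency-based individual assignment concrete, the following is a minimal Python sketch, not the GeneClass or WEKA implementation used in the study. It estimates allele frequencies per breed from reference genotypes, scores a query genotype by its log-likelihood under Hardy-Weinberg proportions, and assigns it to the breed with the highest score. The breed names, allele labels, and toy genotypes are hypothetical; only the locus names HMS1 and HTG6 come from the abstract. The `floor` value substituted for alleles unseen in a breed is an assumption of this sketch.

```python
import math
from collections import Counter, defaultdict


def allele_frequencies(reference):
    """Estimate allele frequencies per breed and locus.

    reference: {breed: [genotype, ...]}, genotype = {locus: (allele1, allele2)}
    """
    freqs = {}
    for breed, genotypes in reference.items():
        counts = defaultdict(Counter)
        for genotype in genotypes:
            for locus, alleles in genotype.items():
                counts[locus].update(alleles)
        freqs[breed] = {
            locus: {a: n / sum(c.values()) for a, n in c.items()}
            for locus, c in counts.items()
        }
    return freqs


def log_likelihood(genotype, breed_freqs, floor=0.01):
    """Log-likelihood of a genotype in one breed under Hardy-Weinberg proportions.

    Alleles never observed in the breed get the frequency `floor`
    (an assumption of this sketch, to avoid log(0)).
    """
    total = 0.0
    for locus, (a1, a2) in genotype.items():
        p = breed_freqs.get(locus, {}).get(a1, floor)
        q = breed_freqs.get(locus, {}).get(a2, floor)
        prob = p * p if a1 == a2 else 2 * p * q
        total += math.log(prob)
    return total


def assign(genotype, freqs):
    """Return the breed whose allele frequencies best explain the genotype."""
    return max(freqs, key=lambda breed: log_likelihood(genotype, freqs[breed]))


# Hypothetical toy reference data: two breeds, two microsatellite loci.
reference = {
    "breed_A": [
        {"HMS1": ("J", "J"), "HTG6": ("O", "O")},
        {"HMS1": ("J", "M"), "HTG6": ("O", "J")},
    ],
    "breed_B": [
        {"HMS1": ("M", "M"), "HTG6": ("J", "J")},
        {"HMS1": ("M", "I"), "HTG6": ("J", "G")},
    ],
}
freqs = allele_frequencies(reference)
query = {"HMS1": ("J", "J"), "HTG6": ("O", "J")}
assigned = assign(query, freqs)
```

With real data, the success rate of such a procedure would be estimated by assigning individuals that were left out of the frequency estimation and counting how many return to their registered breed.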

Citing Articles

The use of SNP markers for cattle breed identification.

Jasielczuk I, Gurgul A, Szmatola T, Radko A, Majewska A, Sosin E. J Appl Genet. 2024; 65(3):575-589.

PMID: 38568414 DOI: 10.1007/s13353-024-00857-0.


Genome-Enabled Prediction Methods Based on Machine Learning.

Reinoso-Pelaez E, Gianola D, Gonzalez-Recio O. Methods Mol Biol. 2022; 2467:189-218.

PMID: 35451777 DOI: 10.1007/978-1-0716-2205-6_7.


Cost-effective horse breeding in the Republic of Bashkortostan, Russia.

Askarov A, Kuznetsova A, Gusmanov R, Askarova A, Kovshov V. Vet World. 2020; 13(10):2039-2045.

PMID: 33281335 PMC: 7704310. DOI: 10.14202/vetworld.2020.2039-2045.
