Comparing assignment-based approaches to breed identification within a large set of horses

L. Putnová, R. Štohl

Journal of Applied Genetics. 2019 ; 60 (2) : 187-198. [pub] 20190408

Language English Country England, Great Britain

Document type Journal Article

Grant support
QH92277 Národní Agentura pro Zemědělský Výzkum
LO1210 Ministerstvo Školství, Mládeže a Tělovýchovy
2108 Mendelova Univerzita v Brně

Considering the extensive data sets and statistical techniques, animal breeding embodies a branch of machine learning that has a constantly increasing impact on breeding. In our study, information regarding the potential of machine learning and data mining within a large set of horses and breeds is presented. The individual assignment methods and factors influencing the success rate of the procedure are compared at the Czech population scale. The fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation amounted to 8.9% among the breeds. The highest genetic divergence (FST = 0.378) was established between the Friesian and Equus przewalskii; the highest degree of gene migration was obtained between the Czech and Bavarian Warmblood (Nm = 14,302); and the overall global heterozygote deficit across the populations was 10.4%. The eight standard methods (Bayesian, frequency, and distance) using GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared by utilizing 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. The breed genomic prediction accuracy reached the highest value in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the breed number and genetic divergence. These statistical tools could be used to assess breed traceability systems, and they exhibit the potential to assist managers in decision-making as regards breeding and registration.
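Note on the reported statistics: the abstract gives both pairwise FST and gene migration (Nm) values but does not state how Nm was derived. A minimal sketch, assuming Wright's island-model approximation Nm ≈ (1 − FST) / (4 · FST), shows how the two quantities are commonly related. The function name `nm_from_fst` is hypothetical, and the formula is an assumption here, not the authors' confirmed method.

```python
def nm_from_fst(fst: float) -> float:
    """Approximate number of migrants per generation (Nm) from FST.

    Assumes Wright's island model: Nm ~= (1 - FST) / (4 * FST).
    This is an illustrative assumption; the abstract does not say
    which estimator the authors used.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


if __name__ == "__main__":
    # FST values quoted in the abstract: the highest pairwise divergence
    # (Friesian vs. Equus przewalskii) and the overall differentiation.
    for label, fst in [("highest pairwise FST", 0.378), ("overall FST", 0.089)]:
        print(f"{label}: FST = {fst:.3f}, Nm ~ {nm_from_fst(fst):.3f}")
```

Under this approximation, a small FST corresponds to a large Nm, which is consistent with the abstract reporting the highest gene migration between two closely related warmblood populations.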

000      
00000naa a2200000 a 4500
001      
bmc19027668
003      
CZ-PrNML
005      
20190823101113.0
007      
ta
008      
190813s2019 enk f 000 0|eng||
009      
AR
024    7_
$a 10.1007/s13353-019-00495-x $2 doi
035    __
$a (PubMed)30963515
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a enk
100    1_
$a Putnová, Lenka $u Laboratory of Agrogenomics, Department of Morphology, Physiology and Animal Genetics, Faculty of Agronomy, Mendel University in Brno, Zemědělská 1665/1, 613 00, Brno, Czech Republic. putnova@email.cz.
245    10
$a Comparing assignment-based approaches to breed identification within a large set of horses / $c L. Putnová, R. Štohl
520    9_
$a Considering the extensive data sets and statistical techniques, animal breeding embodies a branch of machine learning that has a constantly increasing impact on breeding. In our study, information regarding the potential of machine learning and data mining within a large set of horses and breeds is presented. The individual assignment methods and factors influencing the success rate of the procedure are compared at the Czech population scale. The fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation amounted to 8.9% among the breeds. The highest genetic divergence (FST = 0.378) was established between the Friesian and Equus przewalskii; the highest degree of gene migration was obtained between the Czech and Bavarian Warmblood (Nm = 14,302); and the overall global heterozygote deficit across the populations was 10.4%. The eight standard methods (Bayesian, frequency, and distance) using GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared by utilizing 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. The breed genomic prediction accuracy reached the highest value in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the breed number and genetic divergence. These statistical tools could be used to assess breed traceability systems, and they exhibit the potential to assist managers in decision-making as regards breeding and registration.
650    _2
$a Algorithms $7 D000465
650    _2
$a Alleles $7 D000483
650    _2
$a Animals $7 D000818
650    12
$a Breeding $7 D001947
650    _2
$a Gene Frequency $7 D005787
650    _2
$a Genetic Variation $7 D014644
650    12
$a Genomics $7 D023281
650    _2
$a Genotype $7 D005838
650    _2
$a Heterozygote $7 D006579
650    _2
$a Horses $x classification $x genetics $7 D006736
650    _2
$a Microsatellite Repeats $x genetics $7 D018895
650    _2
$a Software $7 D012984
650    _2
$a Species Specificity $7 D013045
655    _2
$a Journal Article $7 D016428
700    1_
$a Štohl, Radek $u Department of Control and Instrumentation, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 3082/12, 616 00, Brno, Czech Republic.
773    0_
$w MED00002521 $t Journal of applied genetics $x 2190-3883 $g Vol. 60, No. 2 (2019), pp. 187-198
856    41
$u https://pubmed.ncbi.nlm.nih.gov/30963515 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20190813 $b ABA008
991    __
$a 20190823101327 $b ABA008
999    __
$a ok $b bmc $g 1432817 $s 1066128
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2019 $b 60 $c 2 $d 187-198 $e 20190408 $i 2190-3883 $m Journal of Applied Genetics $n J Appl Genet $x MED00002521
GRA    __
$a QH92277 $p Národní Agentura pro Zemědělský Výzkum
GRA    __
$a LO1210 $p Ministerstvo Školství, Mládeže a Tělovýchovy
GRA    __
$a 2108 $p Mendelova Univerzita v Brně
LZP    __
$a Pubmed-20190813