
Wrapper feature selection for small sample size data driven by complete error estimates

M. Macaš, L. Lhotská, E. Bakstein, D. Novák, J. Wild, T. Sieger, P. Vostatek, R. Jech

Computer methods and programs in biomedicine. 2012 ; 108 (1) : 138-50.

Language: English. Country: Ireland

Document type: journal article, grant-supported work, validation study

Persistent link   https://www.medvik.cz/link/bmc13012722

This paper focuses on wrapper-based feature selection for a 1-nearest neighbor classifier. We consider in particular the case of a small sample size with a few hundred instances, which is common in biomedical applications. We propose a technique for calculating the complete bootstrap for a 1-nearest-neighbor classifier (i.e., averaging over all desired test/train partitions of the data). The complete bootstrap and the complete cross-validation error estimate with lower variance are applied as novel selection criteria and are compared with the standard bootstrap and cross-validation in combination with three optimization techniques - sequential forward selection (SFS), binary particle swarm optimization (BPSO) and simplified social impact theory based optimization (SSITO). The experimental comparison based on ten datasets draws the following conclusions: for all three search methods examined here, the complete criteria are a significantly better choice than standard 2-fold cross-validation, 10-fold cross-validation and bootstrap with 50 trials irrespective of the selected output number of iterations. All the complete criterion-based 1NN wrappers with SFS search performed better than the widely-used FILTER and SIMBA methods. We also demonstrate the benefits and properties of our approaches on an important and novel real-world application of automatic detection of the subthalamic nucleus.
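To make the wrapper idea in the abstract concrete, here is a minimal sketch of sequential forward selection (SFS) around a 1-nearest-neighbor classifier. It is not the authors' method: the selection criterion below is plain leave-one-out error, used as a stand-in because it is deterministic for 1NN, whereas the paper proposes complete bootstrap and complete cross-validation estimates. The function names `loo_1nn_error` and `sfs`, the Euclidean distance, and the NumPy dependency are all assumptions made for illustration.

```python
import numpy as np


def loo_1nn_error(X, y):
    """Leave-one-out error of a 1NN classifier with Euclidean distance.

    Stand-in criterion only; the paper's complete estimates are not implemented here.
    """
    # Pairwise squared distances between all instances.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # An instance must not be chosen as its own nearest neighbor.
    np.fill_diagonal(d2, np.inf)
    nearest = d2.argmin(axis=1)
    return float(np.mean(y[nearest] != y))


def sfs(X, y, n_select):
    """Greedy forward search: repeatedly add the feature that minimizes the criterion."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        errors = [loo_1nn_error(X[:, selected + [f]], y) for f in remaining]
        best = remaining[int(np.argmin(errors))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Called as `sfs(X, y, n_select=k)` with a feature matrix `X` of shape (instances, features) and a label vector `y`, it returns the indices of the chosen features in the order they were added. Swapping `loo_1nn_error` for another error estimate changes the selection criterion without touching the search.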


000      
00000naa a2200000 a 4500
001      
bmc13012722
003      
CZ-PrNML
005      
20130410102036.0
007      
ta
008      
130404s2012 ie f 000 0|eng||
009      
AR
024    7_
$a 10.1016/j.cmpb.2012.02.006 $2 doi
035    __
$a (PubMed)22472029
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a ie
100    1_
$a Macaš, Martin $u Czech Technical University, Faculty of Electrical Engineering, Department of Cybernetics, Karlovo Namesti 13, 12135 Prague, Czech Republic.
245    10
$a Wrapper feature selection for small sample size data driven by complete error estimates / $c M. Macaš, L. Lhotská, E. Bakstein, D. Novák, J. Wild, T. Sieger, P. Vostatek, R. Jech,
520    9_
$a This paper focuses on wrapper-based feature selection for a 1-nearest neighbor classifier. We consider in particular the case of a small sample size with a few hundred instances, which is common in biomedical applications. We propose a technique for calculating the complete bootstrap for a 1-nearest-neighbor classifier (i.e., averaging over all desired test/train partitions of the data). The complete bootstrap and the complete cross-validation error estimate with lower variance are applied as novel selection criteria and are compared with the standard bootstrap and cross-validation in combination with three optimization techniques - sequential forward selection (SFS), binary particle swarm optimization (BPSO) and simplified social impact theory based optimization (SSITO). The experimental comparison based on ten datasets draws the following conclusions: for all three search methods examined here, the complete criteria are a significantly better choice than standard 2-fold cross-validation, 10-fold cross-validation and bootstrap with 50 trials irrespective of the selected output number of iterations. All the complete criterion-based 1NN wrappers with SFS search performed better than the widely-used FILTER and SIMBA methods. We also demonstrate the benefits and properties of our approaches on an important and novel real-world application of automatic detection of the subthalamic nucleus.
650    _2
$a teoretické modely $7 D008962
650    12
$a velikost vzorku $7 D018401
655    _2
$a časopisecké články $7 D016428
655    _2
$a práce podpořená grantem $7 D013485
655    _2
$a validační studie $7 D023361
700    1_
$a Lhotská, Lenka $u -
700    1_
$a Bakstein, Eduard $u -
700    1_
$a Novák, Daniel $u -
700    1_
$a Wild, Jiří $u -
700    1_
$a Sieger, Tomáš $u -
700    1_
$a Vostatek, Pavel $u -
700    1_
$a Jech, Robert $u -
773    0_
$w MED00001214 $t Computer methods and programs in biomedicine $x 1872-7565 $g Roč. 108, č. 1 (2012), s. 138-50
856    41
$u https://pubmed.ncbi.nlm.nih.gov/22472029 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20130404 $b ABA008
991    __
$a 20130410102305 $b ABA008
999    __
$a ok $b bmc $g 975920 $s 811003
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2012 $b 108 $c 1 $d 138-50 $i 1872-7565 $m Computer methods and programs in biomedicine $n Comput Methods Programs Biomed $x MED00001214
LZP    __
$a Pubmed-20130404
