
A Robust Supervised Variable Selection for Noisy High-Dimensional Data

J. Kalina, A. Schlenker

BioMed research international. 2015 ; 2015 (-) : 320385. [pub] 20150602

Language English Country United States

Document type Journal Article, Research Support, Non-U.S. Gov't

Persistent link   https://www.medvik.cz/link/bmc16010056

The Minimum Redundancy Maximum Relevance (MRMR) approach to supervised variable selection represents a successful methodology for dimensionality reduction, which is suitable for high-dimensional data observed in two or more different groups. Various available versions of the MRMR approach have been designed to search for variables with the largest relevance for a classification task while controlling for redundancy of the selected set of variables. However, usual relevance and redundancy criteria have the disadvantages of being too sensitive to the presence of outlying measurements and/or being inefficient. We propose a novel approach called Minimum Regularized Redundancy Maximum Robust Relevance (MRRMRR), suitable for noisy high-dimensional data observed in two groups. It combines principles of regularization and robust statistics. Particularly, redundancy is measured by a new regularized version of the coefficient of multiple correlation and relevance is measured by a highly robust correlation coefficient based on the least weighted squares regression with data-adaptive weights. We compare various dimensionality reduction methods on three real data sets. To investigate the influence of noise or outliers on the data, we perform the computations also for data artificially contaminated by severe noise of various forms. The experimental results confirm the robustness of the method with respect to outliers.


000      
00000naa a2200000 a 4500
001      
bmc16010056
003      
CZ-PrNML
005      
20180531075640.0
007      
ta
008      
160408s2015 xxu f 000 0|eng||
009      
AR
024    7_
$a 10.1155/2015/320385 $2 doi
035    __
$a (PubMed)26137474
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxu
100    1_
$a Kalina, Jan $u Institute of Computer Science of the Czech Academy of Sciences, Pod Vodárenskou Věží 2, 182 07 Prague 8, Czech Republic.
245    12
$a A Robust Supervised Variable Selection for Noisy High-Dimensional Data / $c J. Kalina, A. Schlenker,
520    9_
$a The Minimum Redundancy Maximum Relevance (MRMR) approach to supervised variable selection represents a successful methodology for dimensionality reduction, which is suitable for high-dimensional data observed in two or more different groups. Various available versions of the MRMR approach have been designed to search for variables with the largest relevance for a classification task while controlling for redundancy of the selected set of variables. However, usual relevance and redundancy criteria have the disadvantages of being too sensitive to the presence of outlying measurements and/or being inefficient. We propose a novel approach called Minimum Regularized Redundancy Maximum Robust Relevance (MRRMRR), suitable for noisy high-dimensional data observed in two groups. It combines principles of regularization and robust statistics. Particularly, redundancy is measured by a new regularized version of the coefficient of multiple correlation and relevance is measured by a highly robust correlation coefficient based on the least weighted squares regression with data-adaptive weights. We compare various dimensionality reduction methods on three real data sets. To investigate the influence of noise or outliers on the data, we perform the computations also for data artificially contaminated by severe noise of various forms. The experimental results confirm the robustness of the method with respect to outliers.
650    12
$a Algorithms $7 D000465
650    12
$a Data Interpretation, Statistical $7 D003627
650    _2
$a Gene Expression Regulation $x genetics $7 D005786
650    _2
$a Humans $7 D006801
650    _2
$a Metabolomics $7 D055432
650    12
$a Models, Theoretical $7 D008962
650    12
$a Proteomics $7 D040901
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Schlenker, Anna, $d 1986- $7 xx0224780 $u Institute of Computer Science of the Czech Academy of Sciences, Pod Vodárenskou Věží 2, 182 07 Prague 8, Czech Republic ; Department of Biomedical Informatics, Faculty of Biomedical Engineering, Czech Technical University in Prague, Náměstí Sítná 3105, 272 01 Kladno, Czech Republic.
773    0_
$w MED00182164 $t BioMed research international $x 2314-6141 $g Vol. 2015, no. - (2015), p. 320385
856    41
$u https://pubmed.ncbi.nlm.nih.gov/26137474 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20160408 $b ABA008
991    __
$a 20180531075836 $b ABA008
999    __
$a ok $b bmc $g 1113485 $s 934424
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2015 $b 2015 $c - $d 320385 $e 20150602 $i 2314-6141 $m BioMed research international $n Biomed Res Int $x MED00182164
LZP    __
$a Pubmed-20160408
