
Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

S. Moretti, D. van Leeuwen, H. Gmuender, S. Bonassi, J. van Delft, J. Kleinjans, F. Patrone, D. F. Merlo

BMC Bioinformatics. 2008; 9: 361. [pub] 20080902

Language: English. Country: United Kingdom

Document type: evaluation study, journal article, grant-supported work

Persistent link   https://www.medvik.cz/link/bmc20014384

BACKGROUND: In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions.

RESULTS: In this paper we introduce a bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, called Comparative Analysis of Shapley value (CASh for short), is applied to data concerning gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for differential gene expression. Both lists of genes, provided by CASh and by the t-test, are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, the biological interpretation of the differences between these two selections turns out to be more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than the t-test for the detection of differential gene expression variability.

CONCLUSION: CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact on the regulation of complex pathways.
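
The abstract does not spell out the coalitional game or the resampling scheme, so the following Python sketch is only an illustration of the general idea, not the authors' CASh implementation. It assumes a "microarray game" built from a Boolean genes-by-samples matrix (True where a gene is abnormally expressed in a sample), for which the Shapley value of a gene is the average, over samples, of one over the number of genes flagged in that sample. It also assumes a pooled resampling of samples to approximate the null hypothesis of equal relevance in the two conditions. The function names, the pooling scheme, and the p-value formula are hypothetical choices made for this example.

```python
import numpy as np


def shapley_microarray_game(B):
    """Shapley value of each gene in an (assumed) microarray game.

    B is a Boolean genes x samples matrix. Each sample with at least one
    flagged gene shares a weight of 1/n_samples equally among its flagged genes.
    """
    B = np.asarray(B, dtype=bool)
    n_samples = B.shape[1]
    support_size = B.sum(axis=0)              # flagged genes per sample
    weights = np.zeros(n_samples)
    nonempty = support_size > 0
    weights[nonempty] = 1.0 / support_size[nonempty]
    return (B * weights).sum(axis=1) / n_samples


def bootstrap_shapley_pvalues(B1, B2, n_boot=1000, seed=0):
    """Illustrative bootstrap test of equal Shapley values in two conditions.

    Assumption: the null distribution is approximated by resampling samples
    with replacement from the pooled conditions. Returns the observed
    per-gene differences and two-sided p-values.
    """
    rng = np.random.default_rng(seed)
    B1 = np.asarray(B1, dtype=bool)
    B2 = np.asarray(B2, dtype=bool)
    observed = shapley_microarray_game(B1) - shapley_microarray_game(B2)
    pooled = np.concatenate([B1, B2], axis=1)
    n1, n2 = B1.shape[1], B2.shape[1]
    exceed = np.zeros(B1.shape[0])
    for _ in range(n_boot):
        idx1 = rng.integers(0, n1 + n2, size=n1)
        idx2 = rng.integers(0, n1 + n2, size=n2)
        diff = (shapley_microarray_game(pooled[:, idx1])
                - shapley_microarray_game(pooled[:, idx2]))
        # Small tolerance guards against floating-point ties.
        exceed += np.abs(diff) >= np.abs(observed) - 1e-12
    # Add-one correction keeps every p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)
```

A gene whose p-value falls below a chosen threshold would be a candidate for differing relevance between the two conditions. How the Boolean matrix is derived from raw expression values is left open here, since the abstract does not describe that step.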


000      
00000naa a2200000 a 4500
001      
bmc20014384
003      
CZ-PrNML
005      
20200921152837.0
007      
ta
008      
200918s2008 xxk f 000 0|eng||
009      
AR
024    7_
$a 10.1186/1471-2105-9-361 $2 doi
035    __
$a (PubMed)18764936
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxk
100    1_
$a Moretti, Stefano $u Epidemiology and Biostatistics, National Cancer Research Institute, Genova, Italy. stefano.moretti@istge.it
245    10
$a Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution / $c S. Moretti, D. van Leeuwen, H. Gmuender, S. Bonassi, J. van Delft, J. Kleinjans, F. Patrone, DF. Merlo,
650    _2
$a air pollution $x statistics and numerical data $7 D000397
650    _2
$a algorithms $7 D000465
650    _2
$a biological markers $x analysis $7 D015415
650    _2
$a child $7 D002648
650    _2
$a computer simulation $7 D003198
650    12
$a statistical data interpretation $7 D003627
650    _2
$a environmental exposure $x analysis $x statistics and numerical data $7 D004781
650    _2
$a epidemiologic methods $7 D004812
650    _2
$a gene expression profiling $x methods $x statistics and numerical data $7 D020869
650    _2
$a humans $7 D006801
650    _2
$a biological models $7 D008954
650    _2
$a statistical models $7 D015233
650    _2
$a proteome $x analysis $7 D020543
650    _2
$a risk assessment $x methods $7 D018570
650    _2
$a risk factors $7 D012307
651    _2
$a Czech Republic $x epidemiology $7 D018153
655    _2
$a evaluation study $7 D023362
655    _2
$a journal article $7 D016428
655    _2
$a grant-supported work $7 D013485
700    1_
$a van Leeuwen, Danitsja
700    1_
$a Gmuender, Hans
700    1_
$a Bonassi, Stefano
700    1_
$a van Delft, Joost
700    1_
$a Kleinjans, Jos
700    1_
$a Patrone, Fioravante
700    1_
$a Merlo, Domenico Franco
773    0_
$w MED00008167 $t BMC bioinformatics $x 1471-2105 $g Vol. 9, No. - (2008), p. 361
856    41
$u https://pubmed.ncbi.nlm.nih.gov/18764936 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20200918 $b ABA008
991    __
$a 20200921152836 $b ABA008
999    __
$a ok $b bmc $g 1565236 $s 1104542
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2008 $b 9 $c - $d 361 $e 20080902 $i 1471-2105 $m BMC bioinformatics $n BMC Bioinformatics $x MED00008167
LZP    __
$a Pubmed-20200918
