A multivariate extension of the gene set enrichment analysis
Language: English; Country: Singapore; Media: print
Document type Journal Article, Research Support, N.I.H., Extramural, Research Support, Non-U.S. Gov't
Grant support
K99LM009477
NLM NIH HHS - United States
R01 GM075299
NIGMS NIH HHS - United States
R21 GM079259
NIGMS NIH HHS - United States
PubMed
17933015
DOI
10.1142/s0219720007003041
PII: S0219720007003041
MeSH
- Phenotype
- Models, Genetic
- Multivariate Analysis
- Oligonucleotide Array Sequence Analysis / statistics & numerical data
- Gene Expression Profiling / statistics & numerical data
- Computational Biology
Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
- Research Support, N.I.H., Extramural
A test-statistic typically employed in the gene set enrichment analysis (GSEA) prevents this method from being genuinely multivariate. In particular, this statistic is insensitive to changes in the correlation structure of the gene sets of interest. The present paper considers the utility of an alternative test-statistic in designing the confirmatory component of the GSEA. This statistic is based on a pertinent distance between joint distributions of expression levels of genes included in the set of interest. The null distribution of the proposed test-statistic, known as the multivariate N-statistic, is obtained by permuting group labels. Our simulation studies and analysis of biological data confirm the conjecture that the N-statistic is a much better choice for multivariate significance testing within the framework of the GSEA. We also discuss some other aspects of the GSEA paradigm and suggest new avenues for future research.
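The permutation procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which distance underlies the N-statistic, so this example assumes the average pairwise Euclidean distance between expression profiles (an energy-distance form); the function names `mean_distance`, `n_statistic`, and `permutation_pvalue` are hypothetical and are not taken from the paper.

```python
import math
import random


def mean_distance(a, b):
    """Average Euclidean distance over all pairs (one profile from a, one from b)."""
    total = sum(math.dist(u, v) for u in a for v in b)
    return total / (len(a) * len(b))


def n_statistic(x, y):
    """Distance-based two-sample statistic (ASSUMED Euclidean kernel).

    x, y: lists of expression profiles, one list of gene values per sample.
    Larger values indicate a larger difference between the joint distributions.
    """
    return 2.0 * mean_distance(x, y) - mean_distance(x, x) - mean_distance(y, y)


def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Estimate the null distribution by permuting group labels."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)
    observed = n_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of samples to the two groups
        if n_statistic(pooled[:n_x], pooled[n_x:]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Group A: two independent genes. Group B: two strongly correlated genes
    # with roughly the same marginal distributions (simulated, illustrative data).
    group_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
    group_b = []
    for _ in range(20):
        z = rng.gauss(0, 1)
        group_b.append([z, z + 0.1 * rng.gauss(0, 1)])
    stat, p = permutation_pvalue(group_a, group_b)
    print(f"statistic = {stat:.3f}, permutation p-value = {p:.3f}")
```

Because the statistic compares whole expression profiles rather than per-gene summaries, it can in principle respond to a change in correlation structure even when the marginal distributions of the individual genes are similar, which is the situation simulated in the demo.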