Article

Record sourced from PubMed

Network-constrained forest for regularized classification of omics data

Authors
Anděl, Michael. Department of Computer Science, Czech Technical University, Technická 2, Prague, Czech Republic. Electronic address: andelmi2@fel.cvut.cz
Kléma, Jiří. Department of Computer Science, Czech Technical University, Technická 2, Prague, Czech Republic. Electronic address: klema@fel.cvut.cz
Krejčík, Zdeněk. Department of Molecular Genetics, Institute of Hematology and Blood Transfusion, U Nemocnice 1, Prague, Czech Republic. Electronic address: zdenek.krejcik@uhkt.cz

Methods (San Diego, Calif.). 2015 Jul 15; 83: 88-97. [Epub 2015 Apr 11]

Source
Methods
ISSN 1095-9130 | 1046-2023

Language: English. Country: United States. Medium: print-electronic

Document type: Journal Article; Research Support, Non-U.S. Gov't

Persistent link: https://www.medvik.cz/link/pmid25872185

PubMed 25872185
DOI 10.1016/j.ymeth.2015.04.006
PII: S1046-2023(15)00152-8

Keywords
Domain knowledge, Machine learning, Omics data, Random forest, Regularization, microRNA
MeSH
Gene Regulatory Networks
Humans
RNA, Messenger / genetics
MicroRNAs / genetics
Artificial Intelligence
Computational Biology / methods
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Substances
RNA, Messenger
MicroRNAs

Contemporary molecular biology relies on large, heterogeneous sets of measurements to model and understand underlying biological processes, including complex diseases. Machine learning is frequently used to build such models. However, models built solely from measured data often suffer from overfitting, as the sample size is typically much smaller than the number of measured features. In this paper, we propose a random forest-based classifier that reduces this overfitting with the aid of prior knowledge in the form of a feature interaction network. We illustrate the proposed method on the task of disease classification from measured mRNA and miRNA profiles, complemented by an interaction network composed of miRNA-mRNA target relations and mRNA-mRNA interactions that correspond to interactions between the encoded proteins. We demonstrate that the proposed network-constrained forest employs prior knowledge to increase learning bias and, consequently, to improve the classification accuracy, stability and comprehensibility of the resulting model. The experiments are carried out in the domain of myelodysplastic syndrome, our long-term research focus. We validate our approach on public ovarian carcinoma data of the same form. We believe that the idea of a network-constrained forest can be straightforwardly generalized to arbitrary omics data for which a non-trivial feature interaction network is available. The proposed method is publicly available in the miXGENE system (http://mixgene.felk.cvut.cz); the workflow that implements the myelodysplastic syndrome experiments is presented as a dedicated case study.
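The abstract does not spell out how the interaction network constrains the forest, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that each tree may use only the features lying within a few hops of a randomly chosen seed feature in the network, and it swaps full decision trees for depth-1 stumps to stay short. All names (`neighborhood`, `fit_stump`, `fit_forest`, `predict`, the `radius` parameter) and the toy feature identifiers are hypothetical.

```python
import random
from collections import Counter, deque


def neighborhood(network, seed, radius):
    """Return all features within `radius` hops of `seed` (breadth-first search)."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == radius:
            continue
        for nxt in network.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


def fit_stump(rows, labels, features):
    """Depth-1 tree: choose the (feature, threshold) with the fewest training errors."""
    best = None
    for f in sorted(features):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left label, right label)


def fit_forest(rows, labels, network, n_trees=25, radius=1, seed=0):
    """Assumed constraint: each stump sees only one network neighborhood."""
    rng = random.Random(seed)
    nodes = sorted(network)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        allowed = neighborhood(network, rng.choice(nodes), radius)
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], allowed)
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


# Toy, made-up data: one miRNA targeting gene1, gene1 interacting with gene2,
# and an isolated gene3. Every network node must appear as a key in each row.
network = {
    "miR-a": ["gene1"],
    "gene1": ["miR-a", "gene2"],
    "gene2": ["gene1"],
    "gene3": [],
}
rows = [
    {"miR-a": 0.9, "gene1": 0.1, "gene2": 0.2, "gene3": 0.5},
    {"miR-a": 0.8, "gene1": 0.2, "gene2": 0.1, "gene3": 0.1},
    {"miR-a": 0.1, "gene1": 0.9, "gene2": 0.8, "gene3": 0.4},
    {"miR-a": 0.2, "gene1": 0.8, "gene2": 0.9, "gene3": 0.2},
]
labels = [0, 0, 1, 1]
forest = fit_forest(rows, labels, network, n_trees=15, radius=1, seed=1)
```

Restricting each learner to a connected neighborhood is one simple way to inject the kind of learning bias the abstract mentions: features that are isolated in the network can only ever be combined with themselves, while interacting mRNA and miRNA features are considered together.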


Cited by (citations provided by Crossref.org)

Novel gene sets improve set-level classification of prokaryotic gene expression data

BMC Bioinformatics. 2015 Oct 28; 16: 348. [Epub 2015 Oct 28]

Source
BMC Bioinformatics
ISSN 1471-2105
