Author: Železný, Filip

9 records in Medvik

Article

Semantic biclustering for finding local, interpretable and predictive expression patterns

Authors: Kléma, Jiří (klema@fel.cvut.cz); Malinka, František; Železný, Filip
Affiliation (all authors): Department of Computer Science, Czech Technical University in Prague, Karlovo náměstí 13, 121 35, Prague 2, Czech Republic

BMC genomics. 2017 ; 18 (Suppl 7) : 752. [pub] 20171016

BMC Genomics
ISSN 1471-2164

BACKGROUND: One of the major challenges in the analysis of gene expression data is to identify local patterns composed of genes showing coherent expression across subsets of experimental conditions. Such patterns may provide an understanding of underlying biological processes related to these conditions. This understanding can be further improved by providing concise characterizations of the genes and situations delimiting the pattern. RESULTS: We propose a method called semantic biclustering, which aims to detect interpretable rectangular patterns in binary data matrices. As usual in biclustering, we seek homogeneous submatrices; however, we also require that the included elements can be jointly described in terms of semantic annotations pertaining to both rows (genes) and columns (samples). To find such interpretable biclusters, we explore two strategies. The first endows an existing biclustering algorithm with the semantic ingredients. The other is based on rule and tree learning known from machine learning. CONCLUSIONS: The two alternatives are tested in experiments with two Drosophila melanogaster gene expression datasets. Both strategies are shown to detect sets of compact biclusters with semantic descriptions that also remain largely valid for unseen (testing) data. This desirable generalization is more pronounced in the strategy stemming from conventional biclustering, although it is traded off against the complexity of the descriptions (number of ontology terms employed), which is lower for the alternative strategy.
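As an illustration of what "homogeneous submatrix" can mean for a binary matrix, the sketch below scores a candidate bicluster by the fraction of ones it contains. This is only an assumed, simplified homogeneity measure; the function name `bicluster_density` and the toy data are hypothetical, and the abstract does not state which criterion the authors actually use.

```python
def bicluster_density(matrix, rows, cols):
    """Return the fraction of ones in the submatrix picked out by rows x cols.

    Assumption: homogeneity is measured as the density of ones; the paper
    may use a different criterion.
    """
    if not rows or not cols:
        raise ValueError("a bicluster needs at least one row and one column")
    ones = sum(matrix[r][c] for r in rows for c in cols)
    return ones / (len(rows) * len(cols))


# Toy binary expression matrix: rows are genes, columns are samples.
expression = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

# Genes 0-1 on samples 0-1 form a fully homogeneous block of ones.
density = bicluster_density(expression, rows=[0, 1], cols=[0, 1])
whole = bicluster_density(expression, rows=[0, 1, 2], cols=[0, 1, 2, 3])
```

A semantic biclustering method would additionally require that the selected rows and columns share annotation terms, which this sketch does not model.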

Article

Novel gene sets improve set-level classification of prokaryotic gene expression data

BMC bioinformatics. 2015 ; 16 : 348. [pub] 20151028

BMC Bioinformatics
ISSN 1471-2105

BACKGROUND: Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets, typically defined on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. METHODS: We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. RESULTS: The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. CONCLUSION: Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data.
The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.

MeSH
Gene Ontology
Metabolic Networks and Pathways / genetics
Operon / genetics
Prokaryotic Cells / metabolism
Gene Expression Regulation *
Machine Learning
Transcription Factors / genetics / metabolism
Publication type
Journal article
Grant-supported research

Article

Tubular atrophy and low netrin-1 gene expression are associated with delayed kidney allograft function

Transplantation. 2014 ; 97 (2) : 176-183.

ISSN 1534-6080

BACKGROUND: Delayed graft function (DGF) caused by ischemia/reperfusion injury (I/RI) negatively influences the outcome of kidney transplantation. This prospective single-center study characterized the intrarenal transcriptome during I/RI as a means of identifying genes associated with DGF development. METHODS: Characterization of the intrarenal transcription profile associated with I/RI was carried out on three sequential graft biopsies from respective allografts before and during transplantation. The intragraft expression of 92 candidate genes was measured using quantitative real-time reverse transcriptase polymerase chain reaction (2^(-ΔΔCt)) in delayed (n=9) and primary function allografts (n=26). RESULTS: Cold storage was not associated with significant changes to the expression profile of the target gene transcripts; however, up-regulation of 16 genes associated with enhanced activation of innate and adaptive immune responses and apoptosis was observed after reperfusion. Multivariate logistic regression analysis revealed that higher tubular atrophy scores (ct) together with a lower expression of Netrin-1 might predict DGF development (training area under the receiver operating characteristic curve = 0.89, cross-validated area under the receiver operating characteristic curve = 0.81). CONCLUSIONS: Poor baseline tubular cell quality (defined by a higher rate of tubular atrophy) combined with the reduced potential of apoptotic survival factors, represented by decreased Netrin-1 gene expression, was associated with delayed kidney graft function.

MeSH
Principal Component Analysis
Atrophy
Biopsy
Immunohistochemistry
Kidney Tubules / pathology
Humans
Logistic Models
Tumor Suppressor Proteins / analysis / genetics
Nerve Growth Factors / analysis / genetics
Delayed Graft Function / etiology / metabolism / pathology
Prospective Studies
Gene Expression Regulation
Reperfusion Injury / complications
Kidney Transplantation / adverse effects
Check Tag
Humans
Publication type
Journal article
Grant-supported research

Abstract

Molekulární prediktory opožděného rozvoje funkce štěpu
[Molecular predictors of delayed development of graft function]

Aktuality v nefrologii. 2012 ; 18 (Suppl. 1) : 19.

Aktual. nefrol. (Print)
ISSN 1210-955X

Český nefrologický kongres s mezinárodní účastí [Czech Nephrology Congress with International Participation]. 2012 ; 18 (Suppl. 1) : 19.


Publication type
Conference abstract

Article

Differential regulation of the nuclear factor-κB pathway by rabbit antithymocyte globulins in kidney transplantation

Transplantation. 2012 ; 93 (6) : 589-596.

ISSN 1534-6080

BACKGROUND: Induction therapy is associated with excellent short-term kidney graft outcome. The aim of this study was to evaluate differences in the intragraft transcriptome after successful induction therapy using two rabbit antithymocyte globulins (rATGs). METHODS: The expression of 376 target genes involved in tolerance, inflammation, T- and B-cell immune response, and apoptosis was evaluated using the quantitative real-time reverse-transcriptase polymerase chain reaction (2^(-ΔΔCt)) method in kidney graft biopsies with normal histological findings and stable renal function, 3 months posttransplantation after induction therapy with Thymoglobulin, ATG-Fresenius S (ATG-F), and a control group without induction therapy. RESULTS: The transcriptional pattern induced by Thymoglobulin differed from that induced by ATG-F in 18 differentially expressed genes. Down-regulation of genes involved in the nuclear factor-κB pathway (TLR4, MYD88, and CD209), costimulation (CD80 and CTLA4), apoptosis (NLRP1), chemoattraction (CCR10), and dendritic cell function (CLEC4C) was observed in the biopsies from patients treated with Thymoglobulin. A hierarchical clustering analysis clearly separated the Thymoglobulin group from the ATG-F group, while the control group had a profile similar to that of the Thymoglobulin group. CONCLUSIONS: Despite normal morphology in graft biopsies taken 3 months posttransplantation, the intrarenal transcriptome differed in patients treated with induction therapy using different rATGs. In the Thymoglobulin high-risk group, the transcriptome profile was identical to that of the low-risk group. Therefore, the down-regulation of the nuclear factor-κB pathway after Thymoglobulin induction in vivo is likely to explain the clinical success of this biologic.
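For readers unfamiliar with the 2^(-ΔΔCt) notation, the sketch below shows the standard relative-quantification arithmetic it refers to: the target gene's threshold cycle (Ct) is normalized to a reference gene, compared between a sample and a control, and converted to a fold change. The function name, parameter names, and Ct values are hypothetical; the abstract does not report which reference genes or control samples were used.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(control)
    fold = 2 ** (-ddCt)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target needs two more cycles (relative to the
# reference gene) in the sample than in the control, i.e. it is 4x less abundant.
fc = fold_change(
    ct_target_sample=26.0, ct_ref_sample=20.0,
    ct_target_control=24.0, ct_ref_control=20.0,
)
```

A fold change below 1 corresponds to down-regulation relative to the control, and a value above 1 to up-regulation.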

MeSH
Antilymphocyte Serum / pharmacology
Apoptosis
Biopsy
Adult
Down-Regulation / drug effects / immunology / physiology
Immunosuppression Therapy / methods
Rabbits
Kidney / metabolism / pathology
Middle Aged
Humans
RNA, Messenger / metabolism
Follow-Up Studies
NF-kappa B / genetics / metabolism
Graft Rejection / immunology / pathology / prevention & control
Aged
Signal Transduction / drug effects / immunology / physiology
Gene Expression Profiling
Kidney Transplantation / immunology / pathology / physiology
Animals
Check Tag
Adult
Rabbits
Middle Aged
Humans
Male
Aged
Female
Animals
Publication type
Journal article
Grant-supported research
Comparative study

Article

Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

BMC bioinformatics. 2012 ; 13 (Suppl 10) : S3.

BMC Bioinformatics
ISSN 1471-2105

We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.

Article

Comparative evaluation of set-level techniques in predictive classification of gene expression samples

BMC bioinformatics. 2012 ; 13 (Suppl 10) : S15.

BMC Bioinformatics
ISSN 1471-2105

BACKGROUND: Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention, as this approach typically yields more compact and interpretable results than those produced by traditional methods that rely on individual genes. The set-level strategy can also be adopted with similar benefits in predictive classification tasks accomplished with machine learning algorithms. Initial studies into the predictive performance of set-level classifiers have yielded rather controversial results. The goal of this study is to provide a more conclusive evaluation by testing various components of the set-level framework within a large collection of machine learning experiments. RESULTS: Genuine curated gene sets constitute better features for classification than sets assembled without biological relevance. For identifying the best gene sets for classification, the Global test outperforms the gene-set methods GSEA and SAM-GS as well as two generic feature selection methods. To aggregate expressions of genes into a feature value, the singular value decomposition (SVD) method as well as the SetSig technique improve on simple arithmetic averaging. Set-level classifiers learned with 10 features selected by the Global test slightly outperform baseline gene-level classifiers learned with all original data features, although they are slightly less accurate than gene-level classifiers learned with a prior feature-selection step. CONCLUSION: Set-level classifiers do not boost predictive accuracy; however, they do achieve competitive accuracy if learned with the right combination of ingredients. AVAILABILITY: Open-source, publicly available software was used for classifier learning and testing. The gene expression datasets and the gene set database used are also publicly available. The full tabulation of experimental results is available at http://ida.felk.cvut.cz/CESLT.
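To make the set-level representation concrete, the sketch below converts one gene-level sample into gene-set features using simple arithmetic averaging, the baseline aggregation named in the abstract. The SVD and SetSig aggregations are not shown. The function name, gene names, and gene sets are hypothetical and serve only as an illustration.

```python
def set_level_features(sample, gene_sets):
    """Map a gene-level sample {gene: expression} to {set_name: mean expression}.

    Genes missing from the sample are skipped; a set with no measured
    genes yields no feature.
    """
    features = {}
    for name, genes in gene_sets.items():
        values = [sample[g] for g in genes if g in sample]
        if values:
            features[name] = sum(values) / len(values)
    return features


# Hypothetical sample with four measured genes and two gene sets.
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 3.0}
gene_sets = {
    "set1": ["geneA", "geneB"],
    "set2": ["geneC", "geneD", "geneE"],  # geneE is not measured in this sample
}
features = set_level_features(sample, gene_sets)
```

The resulting lower-dimensional feature vectors, one value per gene set, would then be passed to an ordinary classifier in place of the original gene-level vectors.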

MeSH
Algorithms *
Bayes Theorem
Decision Trees
Gene Expression Profiling / methods
Support Vector Machine
Artificial Intelligence *
Computational Biology / methods
Publication type
Journal article
Grant-supported research

Article

Prediction of DNA-binding proteins from relational features

Proteome science. 2012 ; 10 (1) : 66.

Proteome Sci
ISSN 1477-5956

BACKGROUND: The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins. RESULTS: Prediction based only on structural relational features already achieves results competitive with existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins. CONCLUSIONS: We introduced a novel approach for DNA-binding propensity prediction using relational machine learning, which could potentially also be used for protein function prediction in general.

Publication type
Journal article

Article

Dolování silných vzorů z lékařských sekvenčních dat
[Mining the strongest patterns in medical sequential data]

Lékař a technika. 2008 ; 38 (4) : 87-94.

ISSN 0301-5491

Sequential data are an important source of medical knowledge, including knowledge that is mined automatically and is potentially new. Such data can originate in various ways. Using a specific study as an example, this paper presents general procedures for mining them. The data come from a long-term preventive study of atherosclerosis: they are the result of two decades of observation recording the development of risk factors and associated conditions. The main goal is to identify frequent sequential patterns, i.e., recurring temporal events, and to study their possible relation to the onset of any of the observed cardiovascular diseases. From the wider range of available methods, the paper focuses on inductive logic programming, which expresses candidate patterns as features in first-order predicate logic. The features are first extracted automatically from the sequential data and then grouped into rules, which are the final form of the mined knowledge. The proposed approach is compared with more traditional, previously published methods, namely sliding windows (windowing) and episode rules.
