Query: gene network inference
BACKGROUND: One way to make gene expression profiling more economical is to use the L1000 platform, which measures the expression of ∼1,000 landmark genes and uses a computational method to infer the expression of another ∼10,000 genes. One such inference method is D-GEX, which employs neural networks. RESULTS: We propose two novel D-GEX architectures that significantly improve the quality of the inference by increasing the capacity of a network without any increase in the number of trained parameters. The architectures partition the network into individual towers. Our best proposed architecture - a checkerboard architecture with a skip connection and five towers - together with minor changes in the training protocol improves the average mean absolute error of the inference from 0.134 to 0.128. CONCLUSIONS: Our proposed approach increases the gene expression inference accuracy without increasing the number of weights of the model, and thus without increasing the memory footprint that limits the model's usage.
Gene expression profiling was made more cost-effective by the NIH LINCS program, which profiles only ∼1,000 selected landmark genes and uses them to reconstruct the whole profile. The D-GEX method employs neural networks to infer the entire profile. However, the original D-GEX can be significantly improved. We propose a novel transformative adaptive activation function that improves gene expression inference even further and generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, a significant improvement over our reimplementation of the original D-GEX, which achieves an average mean absolute error of 0.1637. The proposed transformative adaptive function enables a significantly more accurate reconstruction of the full gene expression profiles with only a small increase in the complexity of the model and its training procedure compared to other methods.
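The abstract above reports its results as an "average mean absolute error". A minimal sketch of one plausible reading of that metric follows: the absolute error is averaged over samples for each inferred gene, and the per-gene values are then averaged over genes. The abstract does not define the averaging order, so this reading is an assumption, and the function name `average_mae` and the toy data are hypothetical.

```python
def average_mae(y_true, y_pred):
    """Average of per-gene mean absolute errors.

    y_true, y_pred: lists of samples, each a list of expression values,
    one value per inferred (target) gene.
    Assumption: "average" means averaging per-gene MAEs over all genes.
    """
    n_samples = len(y_true)
    n_genes = len(y_true[0])
    per_gene_mae = []
    for g in range(n_genes):
        # Mean absolute error of gene g across all samples.
        total = sum(abs(y_true[s][g] - y_pred[s][g]) for s in range(n_samples))
        per_gene_mae.append(total / n_samples)
    # Average the per-gene errors over all genes.
    return sum(per_gene_mae) / n_genes


# Toy data: two samples, two target genes (made up for illustration).
measured = [[1.0, 2.0], [3.0, 4.0]]
inferred = [[1.5, 2.0], [2.5, 5.0]]
score = average_mae(measured, inferred)
```

When every sample contains a value for every gene, this equals the mean absolute error over all entries, so the averaging order only matters if some values are missing.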
BACKGROUND: All currently available methods of network/association inference from microarray gene expression measurements implicitly assume that such measurements represent the actual expression levels of different genes within each cell included in the biological sample under study. Contrary to this common belief, modern microarray technology produces signals aggregated over a random number of individual cells, a "nitty-gritty" aspect of such arrays, thereby causing a random effect that distorts the correlation structure of intra-cellular gene expression levels. RESULTS: This paper provides a theoretical consideration of the random effect of signal aggregation and its implications for correlation analysis and network inference. An attempt is made to quantitatively assess the magnitude of this effect from real data. Some preliminary ideas are offered to mitigate the consequences of random signal aggregation in the analysis of gene expression data. CONCLUSION: Resulting from the summation of expression intensities over a random number of individual cells, the observed signals may not adequately reflect the true dependence structure of intra-cellular gene expression levels needed as a source of information for network reconstruction. Whether the reported effect is extreme or not, the important point is to recognize and incorporate such a signal source for proper inference. The usefulness of inference on genetic regulatory structures from microarray data depends critically on the ability of investigators to overcome this obstacle in a scientifically sound way. REVIEWERS: This article was reviewed by Byung Soo Kim, Jeanne Kowalski and Geoff McLachlan.
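The aggregation effect described in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's analysis: two genes are expressed independently within each cell (gamma-distributed levels, an arbitrary choice), and each observed signal is the sum over all cells in a sample. The helper names `pearson` and `aggregated_signals` are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def aggregated_signals(n_samples, draw_cell_count, rng):
    """Sum two independent per-cell expression levels over each sample's cells."""
    gene_a, gene_b = [], []
    for _ in range(n_samples):
        n_cells = draw_cell_count(rng)
        # Assumption: per-cell levels are independent gamma(2, 1) draws.
        gene_a.append(sum(rng.gammavariate(2.0, 1.0) for _ in range(n_cells)))
        gene_b.append(sum(rng.gammavariate(2.0, 1.0) for _ in range(n_cells)))
    return gene_a, gene_b


rng = random.Random(0)
# Every sample contains exactly 100 cells.
r_fixed = pearson(*aggregated_signals(1000, lambda r: 100, rng))
# The number of cells per sample varies uniformly between 50 and 150.
r_random = pearson(*aggregated_signals(1000, lambda r: r.randint(50, 150), rng))
```

With a fixed cell count the aggregated signals stay close to uncorrelated, as the per-cell levels are. With a random cell count both sums scale with the same cell number, so `r_random` comes out strongly positive even though the genes are independent within each cell.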
- MeSH
- Humans MeSH
- Models, Genetic MeSH
- Statistics, Nonparametric MeSH
- Oligonucleotide Array Sequence Analysis methods, statistics & numerical data MeSH
- Gene Expression Profiling methods, statistics & numerical data MeSH
- Computational Biology methods, statistics & numerical data MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Research Support, N.I.H., Extramural MeSH
BACKGROUND: Inference of protein interaction networks from various sources of data has become an important topic of both systems and computational biology. Here we present a supervised approach to the identification of gene expression regulatory networks. RESULTS: The method is based on a kernel approach combined with genetic programming. As a data source, the method uses gene expression time series to predict interactions among regulatory proteins and their target genes. The performance of the method was verified using Saccharomyces cerevisiae cell cycle and DNA/RNA/protein biosynthesis gene expression data. The results were compared with independent data sources. Finally, novel interactions within yeast gene expression circuits were predicted. CONCLUSION: Results show that, in most cases, our algorithm gives results identical to those of independent experiments when compared with the YEASTRACT database. In several cases our algorithm predicts novel interactions that have not been reported.
- MeSH
- Algorithms MeSH
- Models, Biological MeSH
- Financing, Organized MeSH
- Protein Interaction Mapping methods MeSH
- Computer Simulation MeSH
- Proteome metabolism MeSH
- Gene Expression Regulation physiology MeSH
- Pattern Recognition, Automated methods MeSH
- Signal Transduction physiology MeSH
- Artificial Intelligence MeSH
Inference of gene expression networks has become one of the primary challenges in computational biology. Analysis of microarray experiments using appropriate mathematical models can reveal interactions among protein regulators and target genes. This paper presents a combined approach to the inference of gene expression networks from time series measurements, ChIP-on-chip experiments, analyses of promoter sequences, and protein-protein interaction data. A recursive model of gene expression allowing for identification of active gene expression control networks with up to two regulators of one target gene is presented. The model was used to inspect all possible regulator-target gene combinations and predict those that are active during the underlying biological process. The procedure was applied to the inference of part of a regulatory network of the S. cerevisiae cell cycle, formed by 37 target genes and 128 transcription factors. A set of the most probable networks was suggested and analyzed.
The cell cycle is controlled by the activity of the protein family of cyclins and cyclin-dependent kinases, which are periodically expressed during the cell cycle and are conserved among different species. Genome-wide location analysis found that cyclins are controlled by a small number of transcription factors that form a closed network of genes controlling each other. To investigate the gene expression dynamics of this network, we developed a general procedure for stochastic simulation of the gene expression process. Using the binding data, we simulated gene expression of all genes of the network for all possible combinations of regulatory interactions and, by statistical comparison with experimentally measured time series, excluded those interactions that produced gene expression temporal profiles significantly different from the measured ones. These experiments led to a new definition of the cyclin regulatory network that is coherent with the binding experiments and kinetically plausible. The level of influence of individual regulators on the control of the regulated genes is defined. Simulation results indicate a particular mechanism of regulatory activity of the protein complexes involved in the control of cyclins.
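The abstract above does not say which stochastic simulation procedure was used. As a generic illustration only, the sketch below applies the Gillespie stochastic simulation algorithm (a swapped-in, standard technique, not necessarily the authors' method) to the simplest gene expression model: mRNA is synthesized at a constant rate and each molecule degrades independently. The function name `time_averaged_mrna` and all rate values are hypothetical.

```python
import random


def time_averaged_mrna(k_syn, k_deg, t_end, rng, n0=0):
    """Gillespie simulation of a birth-death mRNA model.

    Reactions: synthesis at rate k_syn, degradation at rate k_deg * n.
    Returns the time-averaged copy number over [0, t_end].
    """
    t, n = 0.0, n0
    weighted_sum = 0.0  # integral of n(t) dt
    while True:
        rate_syn = k_syn
        rate_deg = k_deg * n
        total_rate = rate_syn + rate_deg
        # Waiting time to the next reaction is exponentially distributed.
        dt = rng.expovariate(total_rate)
        if t + dt >= t_end:
            weighted_sum += n * (t_end - t)
            break
        weighted_sum += n * dt
        t += dt
        # Pick the reaction with probability proportional to its rate.
        if rng.random() * total_rate < rate_syn:
            n += 1
        else:
            n -= 1
    return weighted_sum / t_end


rng = random.Random(1)
mean_copies = time_averaged_mrna(k_syn=10.0, k_deg=1.0, t_end=2000.0, rng=rng)
```

For this model the stationary mean copy number is `k_syn / k_deg`, so the long-run average should settle near 10. Simulated profiles of this kind could then be compared statistically with measured time series, as the abstract describes for the full network.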
Formation of a dorsoventral axis is a key event in the early development of most animal embryos. It is well established that bone morphogenetic proteins (Bmps) and Wnts are key mediators of dorsoventral patterning in vertebrates. In the cephalochordate amphioxus, genes encoding Bmps and transcription factors downstream of Bmp signaling such as Vent are expressed in patterns reminiscent of those of their vertebrate orthologues. However, the key question is whether the conservation of expression patterns of network constituents implies conservation of functional network interactions, and if so, how an increased functional complexity can evolve. Using heterologous systems, namely reporter gene assays in mammalian cell lines and transgenesis in medaka fish, we have compared the gene regulatory network implicated in dorsoventral patterning of the basal chordate amphioxus and vertebrates. We found that Bmp but not canonical Wnt signaling regulates promoters of genes encoding the homeodomain proteins AmphiVent1 and AmphiVent2. Furthermore, AmphiVent1 and AmphiVent2 promoters appear to be correctly regulated in the context of a vertebrate embryo. Finally, we show that AmphiVent1 is able to directly repress promoters of the AmphiGoosecoid and AmphiChordin genes. Repression of genes encoding the dorsal-specific signaling molecule Chordin and the transcription factor Goosecoid by Xenopus and zebrafish Vent genes represents a key regulatory interaction during vertebrate axis formation. Our data indicate high evolutionary conservation of a core Bmp-triggered gene regulatory network for dorsoventral patterning in chordates and suggest that co-option of the canonical Wnt signaling pathway for dorsoventral patterning in vertebrates represents one of the innovations through which an increased morphological complexity of the vertebrate embryo is achieved.
- MeSH
- 5' Untranslated Regions MeSH
- Chordata genetics MeSH
- Zebrafish embryology, genetics MeSH
- Embryo, Nonmammalian MeSH
- Phylogeny MeSH
- Genetic Variation genetics, physiology MeSH
- Gene Regulatory Networks MeSH
- Homeodomain Proteins genetics MeSH
- Conserved Sequence genetics MeSH
- Cells, Cultured MeSH
- Humans MeSH
- Evolution, Molecular MeSH
- Molecular Sequence Data MeSH
- Oryzias embryology, genetics MeSH
- Goosecoid Protein genetics MeSH
- Body Patterning genetics MeSH
- Amino Acid Sequence MeSH
- Base Sequence MeSH
- Sequence Homology, Amino Acid MeSH
- Gene Expression Regulation, Developmental MeSH
- Xenopus laevis embryology, genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators, represented by 45 sigma factors, and the genes expressed during germination of the prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded at 13 time points, which provided a database of gene expression time series on a genome-wide scale. Computational modeling of the kinetic relations between the sigma factors, individual genes, and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
- MeSH
- Transcription, Genetic MeSH
- Gene Regulatory Networks * MeSH
- Kinetics MeSH
- Models, Genetic * MeSH
- Computer Simulation MeSH
- Gene Expression Regulation, Bacterial * MeSH
- Oligonucleotide Array Sequence Analysis MeSH
- Sigma Factor metabolism MeSH
- Spores, Bacterial genetics, growth & development, metabolism MeSH
- Gene Expression Profiling * MeSH
- Streptomyces coelicolor genetics, metabolism, physiology MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- MeSH
- Gene Regulatory Networks * MeSH
- RNA * MeSH
- Sequence Analysis, RNA MeSH
- Publication type
- Journal Article MeSH
- Comment MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq, combined with static binding experiments (e.g., ChIP-seq) or literature mining, may be used for inference of sigma factor regulatory networks. RESULTS: We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. CONCLUSIONS: Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.
- MeSH
- Bacteria genetics MeSH
- Time Factors MeSH
- Databases, Genetic * MeSH
- Eukaryota genetics MeSH
- Gene Regulatory Networks * MeSH
- Humans MeSH
- Gene Expression Regulation * MeSH
- Regulon genetics MeSH
- Reproducibility of Results MeSH
- Saccharomyces cerevisiae genetics MeSH
- Software * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH