Augusta: From RNA-Seq to gene regulatory networks and Boolean models

2024 Dec; 23: 783-790. [epub] 20240120

Status PubMed-not-MEDLINE Language English Country Netherlands Media electronic-ecollection

Document type journal articles

Persistent link   https://www.medvik.cz/link/pmid38312198

Grant support
R35 GM119770 NIGMS NIH HHS - United States

Links

PubMed 38312198
PubMed Central PMC10837063
DOI 10.1016/j.csbj.2024.01.013
PII: S2001-0370(24)00013-8
Knihovny.cz E-resources

Computational models of gene regulation help to explain regulatory mechanisms and are used extensively in areas such as biotechnology and medicine. Unfortunately, only a few genome-wide computational models of gene regulation allow both static and dynamic analysis, owing to the lack of sophisticated tools for their reconstruction. Here, we describe Augusta, an open-source Python package for Gene Regulatory Network (GRN) and Boolean Network (BN) inference from high-throughput gene expression data. Augusta can reconstruct genome-wide models suitable for static and dynamic analyses. It uses a unique approach in which a first estimate of the GRN, inferred from expression data, is refined by predicting transcription factor binding motifs in the promoters of regulated genes and by incorporating verified interactions obtained from databases. The refined GRN is then transformed into a draft BN by searching a curated model database and assigning logical rules to the incoming edges of target genes; because the model is provided in the SBML file format, it can be further edited manually. The approach is applicable even if information about the organism under study is not available in the databases, which is typically the case for non-model organisms, including most microbes. Augusta can be operated from the command line and is therefore easy to use for automated prediction of models for various genomes. The Augusta package is freely available at github.com/JanaMus/Augusta. Documentation and tutorials are available at augusta.readthedocs.io.
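To illustrate the general idea of turning a signed GRN into a Boolean model for dynamic analysis, a minimal Python sketch follows. It does not use Augusta's API and is not the package's actual rule-setting procedure. The edge list, the gene names A, B and C, and the default rule (a target is ON when at least one activator is ON and no inhibitor is ON) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical signed GRN edge: (regulator, target, sign); +1 activates, -1 inhibits.
Edge = Tuple[str, str, int]
Rules = Dict[str, Tuple[List[str], List[str]]]
State = Dict[str, bool]


def build_rules(edges: List[Edge]) -> Rules:
    """Group the incoming edges of each target into activators and inhibitors."""
    rules: Rules = {}
    for regulator, target, sign in edges:
        activators, inhibitors = rules.setdefault(target, ([], []))
        (activators if sign > 0 else inhibitors).append(regulator)
    return rules


def step(state: State, rules: Rules) -> State:
    """One synchronous update of all genes.

    Assumed rule: target = (any activator ON) AND NOT (any inhibitor ON).
    A target with inhibitors only is ON unless inhibited.
    Genes without incoming edges keep their current value.
    """
    new_state = dict(state)
    for target, (activators, inhibitors) in rules.items():
        activated = any(state[g] for g in activators) if activators else True
        inhibited = any(state[g] for g in inhibitors)
        new_state[target] = activated and not inhibited
    return new_state


def simulate(state: State, rules: Rules, max_steps: int = 100) -> List[State]:
    """Iterate until a previously visited state recurs (an attractor is reached)."""
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, rules)
        revisited = state in trajectory
        trajectory.append(state)
        if revisited:
            break
    return trajectory


# Toy network: A activates B, B activates C, C inhibits B (negative feedback).
edges = [("A", "B", 1), ("B", "C", 1), ("C", "B", -1)]
rules = build_rules(edges)
trajectory = simulate({"A": True, "B": False, "C": False}, rules)

for s in trajectory:
    print(s)
```

With A held ON, the negative feedback between B and C makes the toy network cycle through four states rather than settle into a fixed point. This is the kind of behaviour a dynamic analysis can reveal and a static network view cannot.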

