Augusta: From RNA-Seq to gene regulatory networks and Boolean models
Status: PubMed-not-MEDLINE; Language: English; Country: Netherlands; Medium: electronic-eCollection
Document type: journal article
Grant support
R35 GM119770
NIGMS NIH HHS - United States
PubMed
38312198
PubMed Central
PMC10837063
DOI
10.1016/j.csbj.2024.01.013
PII: S2001-0370(24)00013-8
- Keywords
- Databases, Gene interactions, Mutual information, Python package, Transcription factor binding motifs
- Publication type
- Journal Article (MeSH)
Computational models of gene regulation help to elucidate regulatory mechanisms and are widely used in areas such as biotechnology and medicine. However, only a few genome-wide computational models of gene regulation support both static and dynamic analysis, largely because sophisticated tools for reconstructing them are lacking. Here, we describe Augusta, an open-source Python package for inferring Gene Regulatory Networks (GRNs) and Boolean Networks (BNs) from high-throughput gene expression data. Augusta can reconstruct genome-wide models suitable for both static and dynamic analyses. It first estimates a GRN from expression data, then refines this estimate by predicting transcription factor binding motifs in the promoters of regulated genes and by incorporating verified interactions obtained from databases. The refined GRN is then transformed into a draft BN by searching a curated model database and assigning logical rules to the incoming edges of target genes; because the model is provided in the SBML file format, it can be further edited manually. The approach remains applicable even when no information about the organism under study is available in the databases, which is typically the case for non-model organisms, including most microbes. Augusta can be run from the command line, which makes it easy to automate model prediction for many genomes. The Augusta package is freely available at github.com/JanaMus/Augusta. Documentation and tutorials are available at augusta.readthedocs.io.
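The first step described in the abstract, estimating a GRN from expression data, can be illustrated with a minimal mutual-information sketch. This is not Augusta's API or its actual implementation: the function names, the equal-frequency binning, the threshold value, and the toy expression data are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, n_bins=2):
    """Equal-frequency binning: rank the values and split the ranks into n_bins groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def infer_edges(expression, threshold=0.5, n_bins=2):
    """Return undirected candidate edges whose mutual information exceeds the threshold."""
    binned = {gene: discretize(vals, n_bins) for gene, vals in expression.items()}
    edges = {}
    for g1, g2 in combinations(sorted(binned), 2):
        mi = mutual_information(binned[g1], binned[g2])
        if mi > threshold:
            edges[(g1, g2)] = mi
    return edges


# Hypothetical toy data: expression of three genes across eight samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8, 12.5, 13.0, 15.7],  # rises with geneA
    "geneC": [1.0, 2.0, 7.0, 8.0, 3.0, 4.0, 5.0, 6.0],     # independent of geneA after binning
}
edges = infer_edges(expression)
```

On this toy input only the geneA/geneB pair passes the threshold. A mutual-information score alone gives neither direction nor sign for an edge, which is one reason a first estimate of this kind benefits from the motif-based and database-based refinement the abstract describes.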
Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
Department of Informatics, Ludwig-Maximilians-Universität München, Munich 80539, Germany
Institute of Forensic Engineering, Brno University of Technology, Brno 61200, Czech Republic