Utilizing RNA-seq data in monotone iterative generalized linear model to elevate prior knowledge quality of the circRNA-miRNA-mRNA regulatory axis
Language: English. Country: England, United Kingdom. Medium: electronic
Document type: journal article
Grant support
SGS23/184/OHK3/3T/13
Grant Agency of the Czech Technical University in Prague
e-INFRA CZ project (ID:90254)
Ministry of Education, Youth and Sports of the Czech Republic
PubMed
40426030
PubMed Central
PMC12117772
DOI
10.1186/s12859-025-06161-w
PII: 10.1186/s12859-025-06161-w
- Keywords
- Bayesian network, Circular RNA, Functional annotation, Penalized regression, Structure inference
- MeSH
- Algorithms MeSH
- Gene Regulatory Networks MeSH
- RNA, Circular * genetics metabolism MeSH
- Humans MeSH
- Linear Models MeSH
- RNA, Messenger * genetics metabolism MeSH
- MicroRNAs * genetics metabolism MeSH
- Sequence Analysis, RNA methods MeSH
- Transcriptome sequencing * methods MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Substance names
- RNA, Circular * MeSH
- RNA, Messenger * MeSH
- MicroRNAs * MeSH
BACKGROUND: Experimental data on RNA interactions remain limited, particularly for non-coding RNAs, many of which were discovered only recently and operate within complex regulatory networks. Researchers often rely on in-silico interaction detection algorithms, such as TargetScan, which are based on biochemical sequence alignment, but the performance of these algorithms is limited. RNA-seq expression data can provide valuable insights into regulatory networks, especially for understudied interactions such as circRNA-miRNA-mRNA. By integrating RNA-seq data with prior interaction networks obtained experimentally or through in-silico predictions, researchers can discover novel interactions, validate existing ones, and improve interaction prediction accuracy.
RESULTS: This paper introduces Pi-GMIFS, an extension of the generalized monotone incremental forward stagewise (GMIFS) regression algorithm that incorporates prior knowledge. The algorithm first estimates prior response values through a prior-only regression, then interpolates between these prior values and the original data, and finally applies the GMIFS method to the interpolated response. Our experimental results on circRNA-miRNA-mRNA regulatory interaction networks demonstrate that Pi-GMIFS consistently enhances precision and recall in RNA interaction prediction by leveraging implicit information from bulk RNA-seq expression data, outperforming the initial prior knowledge.
CONCLUSION: Pi-GMIFS is a robust algorithm for inferring acyclic interaction networks when the variable ordering is known. Its effectiveness was confirmed through extensive experimental validation. We showed that RNA-seq data of a representative size help infer interactions that are missing from the prior knowledge but available in TarBase v9, and improve the quality of circRNA disease annotation.
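The three steps named in the abstract (prior-only regression, interpolation, stagewise fit) can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a Gaussian squared-error loss instead of a generalized linear model for counts, and the function names `monotone_stagewise` and `prior_interpolated_stagewise`, the mixing weight `eta`, the step size `eps`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def monotone_stagewise(X, y, eps=0.01, n_steps=2000, tol=1e-8):
    """Monotone incremental forward stagewise fit (Gaussian loss, illustrative).

    The design is expanded to [X, -X] so that every coefficient only grows;
    the signed estimate is the difference of the two halves.
    """
    p = X.shape[1]
    X_exp = np.hstack([X, -X])
    beta_exp = np.zeros(2 * p)
    residual = np.asarray(y, dtype=float).copy()
    for _ in range(n_steps):
        # Negative gradient of 0.5 * ||y - X_exp @ beta_exp||^2
        grad = X_exp.T @ residual
        j = int(np.argmax(grad))
        if grad[j] <= tol:
            break  # no expanded predictor can reduce the loss further
        beta_exp[j] += eps
        residual -= eps * X_exp[:, j]
    return beta_exp[:p] - beta_exp[p:]

def prior_interpolated_stagewise(X, y, prior_mask, eta=0.5, eps=0.01, n_steps=2000):
    """Hypothetical prior-informed variant following the three steps above."""
    y = np.asarray(y, dtype=float)
    # Step 1: prior-only regression on predictors supported by prior knowledge
    X_prior = X[:, prior_mask]
    coef, *_ = np.linalg.lstsq(X_prior, y, rcond=None)
    y_prior = X_prior @ coef
    # Step 2: interpolate between prior response values and the observed data
    y_mix = eta * y_prior + (1.0 - eta) * y
    # Step 3: stagewise fit on all candidate predictors
    return monotone_stagewise(X, y_mix, eps=eps, n_steps=n_steps)

# Synthetic example: two true regulators, both present in the prior
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
true_beta = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.standard_normal(200)
prior_mask = np.array([True, True, False, False, False, False])
beta_hat = prior_interpolated_stagewise(X, y, prior_mask, eta=0.5)
```

In this sketch, `eta=0` ignores the prior and reduces to the plain stagewise fit, while larger values pull the response toward what the prior-supported predictors alone can explain. How the published method chooses this balance is not stated in the abstract.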