miRBench: novel benchmark datasets for microRNA binding site prediction that mitigate against prevalent microRNA frequency class bias
Language: English; Country: England, United Kingdom; Medium: print
Document type: journal article
Grant support
101086768
BioGeMT
RNS-2024-022
University of Malta and miRBench
Collaboration for microRNA Benchmarking
Xjenza Malta awarded to Panagiotis Alexiou
Bioinformatics Core Facility of CEITEC Masaryk University
LM2023067
NCMG Research Infrastructure
Novel Drug Targets for Infectious Diseases
COV.RD.2020-11
Malta Council for Science and Technology
Ministry of Education, Youth and Sports
Czech Republic
PubMed: 40662834
PubMed Central: PMC12261448
DOI: 10.1093/bioinformatics/btaf233
PII: 8199406
- MeSH
- algorithms MeSH
- benchmarking MeSH
- humans MeSH
- microRNAs * metabolism genetics chemistry MeSH
- neural networks MeSH
- software * MeSH
- binding sites MeSH
- computational biology * methods MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- Substance names
- microRNAs * MeSH
MOTIVATION: MicroRNAs (miRNAs) are crucial regulators of gene expression, but the precise mechanisms governing their binding to target sites remain unclear. A major contributing factor to this is the lack of unbiased experimental datasets for training accurate prediction models. While recent experimental advances have provided numerous miRNA-target interactions, these are solely positive interactions. Generating negative examples in silico is challenging and prone to introducing biases, such as the miRNA frequency class bias identified in this work. Biases within datasets can compromise model generalization, leading models to learn dataset-specific artifacts rather than true biological patterns.

RESULTS: We introduce a novel methodology for negative sample generation that effectively mitigates the miRNA frequency class bias. Using this methodology, we curate several new, extensive datasets and benchmark several state-of-the-art methods on them. We find that a simple convolutional neural network model, retrained on some of these datasets, is able to outperform state-of-the-art methods reaching average precision scores between 0.81 and 0.86 in test datasets. This highlights the potential for leveraging unbiased datasets to achieve improved performance in miRNA binding site prediction. To facilitate further research and lower the barrier to entry for machine learning researchers, we provide an easily accessible Python package, miRBench, for dataset retrieval, sequence encoding, and the execution of state-of-the-art models.

AVAILABILITY AND IMPLEMENTATION: The miRBench Python package is accessible at https://github.com/katarinagresova/miRBench/releases/tag/v1.0.1.
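The miRNA frequency class bias named in the abstract can be illustrated with a small sketch. The abstract does not describe the authors' procedure, so the code below is only one plausible way to build negatives under a stated assumption: every miRNA should occur equally often in the positive and the negative class, so that a model cannot use miRNA identity alone as a shortcut. The function name `make_frequency_matched_negatives`, its parameters, and the toy identifiers are hypothetical and are not part of the miRBench package.

```python
import random
from collections import Counter


def make_frequency_matched_negatives(positives, seed=0, max_tries=100):
    """Build one negative pair per positive pair (illustrative assumption,
    not the published method).

    positives: list of (mirna, target) tuples from an experiment.
    Each negative keeps the miRNA of a positive and swaps in a target
    taken from a positive of a *different* miRNA, skipping any pair that
    is itself a known positive. Per-miRNA counts are therefore identical
    in both classes.
    """
    rng = random.Random(seed)
    known = set(positives)
    negatives = []
    for mirna, _ in positives:
        for _ in range(max_tries):
            other_mirna, target = positives[rng.randrange(len(positives))]
            if other_mirna != mirna and (mirna, target) not in known:
                negatives.append((mirna, target))
                break
        else:
            # No valid partner found within the retry budget.
            raise ValueError(f"no negative target found for {mirna}")
    return negatives


# Toy data with hypothetical identifiers.
toy_positives = [
    ("miR-A", "site1"),
    ("miR-A", "site2"),
    ("miR-B", "site3"),
    ("miR-C", "site4"),
]
toy_negatives = make_frequency_matched_negatives(toy_positives)

# Per-miRNA counts in each class; these match by construction.
positive_counts = Counter(m for m, _ in toy_positives)
negative_counts = Counter(m for m, _ in toy_negatives)
```

Comparing `positive_counts` with `negative_counts` is a quick check for this kind of bias in any dataset: a large mismatch for some miRNAs means the class label is partly predictable from the miRNA alone.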
Central European Institute of Technology Masaryk University Brno 62500 Czech Republic
Centre for Molecular Medicine and Biobanking University of Malta Msida MSD 2080 Malta