GERONIMO: A tool for systematic retrieval of structural RNAs in a broad evolutionary context

2022 Dec 28; 12. Epub 2023 Oct 17.

Language: English. Country: United States. Medium: print-electronic

Document type: journal articles, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid37848616

BACKGROUND: While web-based tools such as BLAST have made identifying conserved gene homologs appear easy, genes with variable sequences pose significant challenges. Functionally important noncoding RNAs (ncRNAs) often show low sequence conservation due to genetic variation, including insertions and deletions. Rather than conserved sequences, these RNAs possess structural features that are highly conserved across a broad phylogenetic range. Such features can be identified using covariance models, which combine a sequence alignment with a consensus RNA secondary structure. However, running the standard implementation of that approach (Infernal) requires more advanced bioinformatics knowledge than user-friendly web services like BLAST. The issue is partially addressed by RNAcentral, which can search for homologs across a broad range of ncRNA sequence collections from diverse organisms, but not across genome assemblies.

RESULTS: Here, we present GERONIMO, which conducts evolutionary searches across hundreds of genomes in a fully automated way. It reports results in their taxonomic context, as summary tables and visualizations, to facilitate analysis. Additionally, GERONIMO supplements homologous sequences with genomic regions for analyzing promoter motifs or gene collinearity, which helps validate the results.

CONCLUSION: GERONIMO, built using Snakemake, has been tested extensively on hundreds of genomes and is a valuable tool for identifying ncRNA homologs across diverse taxonomic groups. Consequently, GERONIMO facilitates the investigation of the evolutionary patterns of functionally significant ncRNAs, whose understanding has previously been limited to individual organisms and their close relatives.
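The idea behind covariance models, that base pairing can be conserved while the sequence itself varies, can be illustrated with a minimal sketch. This is not GERONIMO's or Infernal's code: the toy alignment, the dot-bracket consensus structure, and the helper names `paired_columns` and `pair_support` are all assumptions made up for illustration.

```python
# Toy illustration only: the alignment, the consensus structure, and the
# helper names below are hypothetical and are not part of GERONIMO or Infernal.

# Watson-Crick pairs plus G-U wobble pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket consensus structure."""
    stack, pairs = [], []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.append((stack.pop(), idx))
    return sorted(pairs)


def pair_support(alignment, structure):
    """For each consensus pair, report how many sequences form a canonical
    base pair and how many distinct canonical pair types are observed."""
    report = []
    for i, j in paired_columns(structure):
        observed = [(seq[i], seq[j]) for seq in alignment]
        canonical = [p for p in observed if p in CANONICAL]
        report.append({
            "columns": (i, j),
            "canonical_fraction": len(canonical) / len(observed),
            "distinct_pairs": len(set(canonical)),
        })
    return report


# Three made-up aligned sequences that differ at most positions
# but can all fold into the same three-base-pair stem.
alignment = [
    "GGCAUUCGCC",
    "GACAAAAGUC",
    "CUGAGUACAG",
]
structure = "(((....)))"

for row in pair_support(alignment, structure):
    print(row)
```

In this toy case every consensus pair is formed in all three sequences even though the paired nucleotides change between them. Compensatory changes of this kind are the signal that a structure-aware search can exploit and that a purely sequence-based search tends to miss.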
