Most cited article - PubMed ID 31032840
rPredictorDB: a predictive database of individual secondary structures of RNAs and their formatted plots
Non-coding RNAs (ncRNA) are essential for all life, and their functions often depend on their secondary (2D) and tertiary structure. Despite the abundance of software for the visualisation of ncRNAs, few automatically generate consistent and recognisable 2D layouts, which makes it challenging for users to construct, compare and analyse structures. Here, we present R2DT, a method for predicting and visualising a wide range of RNA structures in standardised layouts. R2DT is based on a library of 3,647 templates representing the majority of known structured RNAs. R2DT has been applied to ncRNA sequences from the RNAcentral database and produced >13 million diagrams, creating the world's largest RNA 2D structure dataset. The software is amenable to community expansion, and is freely available at https://github.com/rnacentral/R2DT and a web server is found at https://rnacentral.org/r2dt .
- MeSH
- Databases, Nucleic Acid MeSH
- Nucleic Acid Conformation MeSH
- RNA, Untranslated chemistry MeSH
- Reproducibility of Results MeSH
- RNA chemistry MeSH
- Sequence Analysis, RNA MeSH
- Software MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Intramural MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Names of Substances
- RNA, Untranslated MeSH
- RNA MeSH
Searching for similar sequences in a database via BLAST or a similar tool is one of the most common bioinformatics tasks applied in general, and to non-coding RNAs in particular. However, the results of the search might be difficult to interpret due to the presence of partial matches to the database subject sequences. Here, we present rboAnalyzer - a tool that helps with interpreting sequence search result by (1) extending partial matches into plausible full-length subject sequences, (2) predicting homology of RNAs represented by full-length subject sequences to the query RNA, (3) pooling information across homologous RNAs found in the search results and public databases such as Rfam to predict more reliable secondary structures for all matches, and (4) contextualizing the matches by providing the prediction results and other relevant information in a rich graphical output. Using predicted full-length matches improves secondary structure prediction and makes rboAnalyzer robust with regards to identification of homology. The output of the tool should help the user to reliably characterize non-coding RNAs in BLAST output. The usefulness of the rboAnalyzer and its ability to correctly extend partial matches to full-length is demonstrated on known homologous RNAs. To allow the user to use custom databases and search options, rboAnalyzer accepts any search results as a text file in the BLAST format. The main output is an interactive HTML page displaying the computed characteristics and other context of the matches. The output can also be exported in an appropriate sequence and/or secondary structure formats.
- Keywords
- RNA, RNA homology, database, search, secondary structure, sequence,
- Publication type
- Journal Article MeSH