• This record comes from PubMed

When will RNA get its AlphaFold moment?

2023 Oct 13; 51(18): 9522-9532.

Language: English
Country: England, Great Britain
Media: print

Document type: Journal Article

Grant support
2019/35/B/ST6/03074 National Science Centre Poland
European Molecular Biology Laboratory
Politechnika Poznańska
LM2023055 ELIXIR CZ
RVO 86652036 Akademie Věd České Republiky

The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement about building on the success of AlphaFold to predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that several challenges will prevent the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the main challenge is the limited number of structures and alignments, which makes data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, which are often of insufficient quality, highly biased, and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but doing so will require solving several data quality and volume issues, using data beyond simple sequence alignments, or developing new, less data-hungry machine learning methods.
