Most cited article - PubMed ID 15016910
RNA conformational classes
The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches. We believe that there are challenges preventing the successful development of deep learning-based methods like AlphaFold for RNA in the short term. Broadly speaking, the challenges are the limited number of structures and alignments making data-hungry deep learning methods unlikely to succeed. Additionally, there are several issues with the existing structure and sequence data, as they are often of insufficient quality, highly biased and missing key information. Here, we discuss these challenges in detail and suggest some steps to remedy the situation. We believe that it is possible to create an accurate RNA structure prediction method, but it will require solving several data quality and volume issues, usage of data beyond simple sequence alignments, or the development of new less data-hungry machine learning methods.
- MeSH
- deep learning MeSH
- nucleic acid conformation MeSH
- models, molecular MeSH
- RNA * chemistry metabolism genetics MeSH
- RNA folding MeSH
- sequence alignment MeSH
- software MeSH
- machine learning MeSH
- Publication type
- journal article MeSH
- Substance names
- RNA * MeSH
A detailed description of the dnatco.datmos.org web server implementing the universal structural alphabet of nucleic acids is presented. It is capable of processing any mmCIF- or PDB-formatted files containing DNA or RNA molecules; these can either be uploaded by the user or supplied as the wwPDB or PDB-REDO structural database access code. The web server performs an assignment of the nucleic acid conformations and presents the results for the intuitive annotation, validation, modeling and refinement of nucleic acids.
- Keywords
- annotation, nucleic acids, refinement, structural alphabets, validation
- MeSH
- databases, nucleic acid MeSH
- DNA chemistry MeSH
- internet MeSH
- nucleic acid conformation MeSH
- models, molecular MeSH
- RNA chemistry MeSH
- software * MeSH
- Publication type
- journal article MeSH
- Substance names
- DNA MeSH
- RNA MeSH
By analyzing almost 120 000 dinucleotides in over 2000 nonredundant nucleic acid crystal structures, we define 96+1 diNucleotide Conformers, NtCs, which describe the geometry of RNA and DNA dinucleotides. NtC classes are grouped into 15 codes of the structural alphabet CANA (Conformational Alphabet of Nucleic Acids) to simplify symbolic annotation of the prominent structural features of NAs and their intuitive graphical display. The search for nontrivial patterns of NtCs resulted in the identification of several types of RNA loops, some of them observed for the first time. Over 30% of the nearly six million dinucleotides in the PDB cannot be assigned to any NtC class, but we demonstrate that up to half of them can be re-refined with the help of proper refinement targets. A statistical analysis of the preferences of NtCs and CANA codes for the 16 dinucleotide sequences showed that neither the NtC class AA00, which forms the scaffold of RNA structures, nor BB00, the most populated DNA class, is sequence neutral; the distributions of both are significantly biased. The reported automated assignment of the NtC classes and CANA codes, available at dnatco.org, provides a powerful tool for unbiased analysis of nucleic acid structures by structural and molecular biologists.
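The sequence-neutrality question raised in this abstract can be illustrated with a chi-square goodness-of-fit statistic over the 16 dinucleotide sequences. The sketch below is only an illustration under stated assumptions: the counts are invented toy numbers, the neutral expectation is taken as uniform, and the function name `chi_square_statistic` is hypothetical. It is not the statistical procedure used by the authors.

```python
from itertools import product

# The 16 dinucleotide sequences (RNA alphabet assumed for this illustration).
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGU", repeat=2)]


def chi_square_statistic(observed, expected_fraction):
    """Pearson chi-square statistic of observed counts against expected fractions."""
    total = sum(observed.get(seq, 0) for seq in DINUCLEOTIDES)
    statistic = 0.0
    for seq in DINUCLEOTIDES:
        expected = total * expected_fraction[seq]
        statistic += (observed.get(seq, 0) - expected) ** 2 / expected
    return statistic


# Toy counts, invented for the example: one conformer class, tallied by sequence.
observed = {seq: 100 for seq in DINUCLEOTIDES}
observed["GC"] = 160  # hypothetical over-representation
observed["UA"] = 40   # hypothetical under-representation

# Assumption: a sequence-neutral class would populate all 16 sequences equally.
uniform = {seq: 1.0 / 16.0 for seq in DINUCLEOTIDES}

statistic = chi_square_statistic(observed, uniform)
# With 16 categories there are 15 degrees of freedom; the 5% critical value is 24.996.
is_biased = statistic > 24.996
```

In practice the expected fractions would more plausibly come from the overall dinucleotide composition of the dataset rather than from a uniform distribution.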
- MeSH
- biocatalysis MeSH
- DNA chemistry classification MeSH
- nucleic acid conformation * MeSH
- nucleotide motifs * MeSH
- nucleotides chemistry classification MeSH
- reproducibility of results MeSH
- riboswitch MeSH
- ribosomes chemistry metabolism MeSH
- RNA, catalytic chemistry metabolism MeSH
- RNA chemistry classification MeSH
- binding sites MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA MeSH
- nucleotides MeSH
- riboswitch MeSH
- RNA, catalytic MeSH
- RNA MeSH
BACKGROUND: A growing number of crystal and NMR structures reveals considerable structural polymorphism of DNA architecture, going well beyond the usual image of a double-helical molecule. DNA is highly variable, with dinucleotide steps exhibiting substantial flexibility in a sequence-dependent manner. An analysis of the conformational space of the DNA backbone and a better understanding of the conformational dependencies in DNA are therefore important for full comprehension of DNA structural polymorphism. RESULTS: A detailed classification of local DNA conformations based on the technique of Fourier averaging was published in our previous work. However, this procedure requires a considerable amount of manual work. To overcome this limitation, we developed an automatic classification method that combines supervised and unsupervised approaches. The proposed workflow is composed of a k-NN method followed by a non-hierarchical single-pass clustering algorithm. We applied this workflow to analyze 816 X-ray and 664 NMR DNA structures released up to February 2013. We identified and annotated six new conformers, and we assigned four of these conformers to two structurally important DNA families: guanine quadruplexes and Holliday (four-way) junctions. We also compared populations of the assigned conformers in the dataset of X-ray and NMR structures. CONCLUSIONS: In the present work we developed a machine learning workflow for the automatic classification of dinucleotide conformations. Dinucleotides with unassigned conformations can either be classified into one of the 24 already known classes or be flagged as unclassifiable. The proposed machine learning workflow permits identification of new classes among so far unclassifiable data, and we identified and annotated six new conformations in the X-ray structures released since our previous analysis.
The results illustrate the utility of machine learning approaches in the classification of local DNA conformations.
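The two-stage idea described in this abstract (a k-NN step followed by single-pass clustering of what remains unassigned) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three-torsion vectors, the labels "A-like" and "B-like", the thresholds `max_dist` and `radius`, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def torsion_distance(u, v):
    """Euclidean distance built from per-torsion circular differences."""
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(u, v)))


def knn_classify(query, labelled, k=3, max_dist=40.0):
    """Majority label among the k nearest labelled points.

    Returns None (unclassified) when even the nearest neighbour is
    farther away than max_dist.
    """
    nearest = sorted(labelled, key=lambda item: torsion_distance(query, item[0]))[:k]
    if not nearest or torsion_distance(query, nearest[0][0]) > max_dist:
        return None
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leader_clustering(points, radius=30.0):
    """Single-pass clustering: a point joins the first cluster whose leader
    lies within radius, otherwise it becomes the leader of a new cluster."""
    clusters = []  # list of (leader, members)
    for point in points:
        for leader, members in clusters:
            if torsion_distance(point, leader) <= radius:
                members.append(point)
                break
        else:
            clusters.append((point, [point]))
    return clusters


# Toy training set with hypothetical labels and torsion values (degrees).
labelled = [
    ((300.0, 180.0, 60.0), "A-like"),
    ((295.0, 175.0, 55.0), "A-like"),
    ((60.0, 180.0, 300.0), "B-like"),
    ((65.0, 185.0, 305.0), "B-like"),
]
queries = [(298.0, 178.0, 58.0), (180.0, 60.0, 180.0), (185.0, 65.0, 175.0)]

# Stage 1: supervised assignment; stage 2: cluster whatever stays unassigned.
unassigned = [q for q in queries if knn_classify(q, labelled) is None]
new_clusters = leader_clustering(unassigned)
```

Clusters found in the second stage are candidates for new conformer classes; deciding whether one is structurally meaningful would still need manual inspection.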
- MeSH
- algorithms MeSH
- DNA chemistry MeSH
- G-quadruplexes MeSH
- classification methods MeSH
- nucleic acid conformation MeSH
- crystallography, X-ray MeSH
- nuclear magnetic resonance, biomolecular MeSH
- workflow MeSH
- cluster analysis MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA MeSH
Folded RNA molecules are shaped by an astonishing variety of highly conserved noncanonical molecular interactions and backbone topologies. The dinucleotide platform is a widespread recurrent RNA modular building submotif formed by the side-by-side pairing of bases from two consecutive nucleotides within a single strand, with highly specific sequence preferences. This unique arrangement of bases is cemented by an intricate network of noncanonical hydrogen bonds and facilitated by a distinctive backbone topology. The present study investigates the gas-phase intrinsic stabilities of the three most common RNA dinucleotide platforms - 5'-GpU-3', ApA, and UpC - via state-of-the-art quantum-chemical (QM) techniques. The mean stability of base-base interactions decreases with sequence in the order GpU > ApA > UpC. Bader's atoms-in-molecules analysis reveals that the N2(G)…O4(U) hydrogen bond of the GpU platform is stronger than the corresponding hydrogen bonds in the other two platforms. The mixed-pucker sugar-phosphate backbone conformation found in most GpU platforms, in which the 5'-ribose sugar (G) is in the C2'-endo form and the 3'-sugar (U) in the C3'-endo form, is intrinsically more stable than the standard A-RNA backbone arrangement, partially as a result of a favorable O2'…O2P intra-platform interaction. Our results thus validate the hypothesis of Lu et al. (Lu Xiang-Jun, et al. Nucleic Acids Res. 2010, 38, 4868-4876), that the superior stability of GpU platforms is partially mediated by the strong O2'…O2P hydrogen bond. In contrast, ApA and especially UpC platform-compatible backbone conformations are rather diverse and do not display any characteristic structural features. The average stabilities of ApA and UpC derived backbone conformers are also lower than those of GpU platforms. 
Thus, the observed structural and evolutionary patterns of the dinucleotide platforms can be accounted for, to a large extent, by their intrinsic properties as described by modern QM calculations. In contrast, we show that the dinucleotide platform is not properly described in the course of atomistic explicit-solvent simulations. Our work also gives methodological insights into QM calculations of experimental RNA backbone geometries. Such calculations are inherently complicated by rather large data and refinement uncertainties in the available RNA experimental structures, which often preclude reliable energy computations.
- Publication type
- journal article MeSH
Functional RNA molecules such as ribosomal RNAs frequently contain highly conserved internal loops with a 5'-UAA/5'-GAN (UAA/GAN) consensus sequence. The UAA/GAN internal loops adopt a distinctive structure inconsistent with secondary structure predictions. The structure has a narrow major groove and forms a trans Hoogsteen/Sugar edge (tHS) A/G base pair followed by an unpaired stacked adenine, a trans Watson-Crick/Hoogsteen (tWH) U/A base pair and finally by a bulged nucleotide (N). The structure is further stabilized by a three-adenine stack and a base-phosphate interaction. In the ribosome, the UAA/GAN internal loops are involved in extensive tertiary contacts, mainly as donors of A-minor interactions. Further, this sequence can adopt an alternative 2D/3D pattern stabilized by a four-adenine stack involved in a smaller number of tertiary interactions. The solution structure of an isolated UAA/GAA internal loop shows substantially rearranged base pairing with three consecutive non-Watson-Crick base pairs. Its A/U base pair adopts an incomplete cis Watson-Crick/Sugar edge (cWS) A/U conformation instead of the expected Watson-Crick arrangement. We performed 3.1 µs of explicit solvent molecular dynamics (MD) simulations of the X-ray and NMR UAA/GAN structures, supplemented by MM-PBSA free energy calculations, locally enhanced sampling (LES) runs, targeted MD (TMD) and nudged elastic band (NEB) analysis. We compared the parm99 and parmbsc0 force fields and net-neutralizing Na+ vs. excess-salt KCl ion environments. Both force fields provide a similar description of the simulated structures, with parmbsc0 leading to a modest narrowing of the major groove. The excess-salt simulations have a similar effect. While the NMR structure is entirely stable in simulations, the simulated X-ray structure shows considerable widening of the major groove, loss of the base-phosphate interaction and other instabilities.
The alternative X-ray geometry even undergoes conformational transition towards the solution 2D structure. Free energy calculations confirm that the X-ray arrangement is less stable than the solution structure. LES, TMD and NEB provide a rather consistent pathway for interconversion between the X-ray and NMR structures. In simulations, the incomplete cWS A/U base pair of the NMR structure is water mediated and alternates with the canonical A-U base pair, which is not indicated by the NMR data. Completion of full cWS A/U base pair is prevented by the overall internal loop arrangement. In summary, the simulations confirm that the UAA/GAN internal loop is a molecular switch RNA module that adopts its functional geometry upon specific tertiary contexts.
- Publication type
- journal article MeSH
The geometry of the phosphodiester backbone was analyzed for 7739 dinucleotides from 447 selected crystal structures of naked and complexed DNA. Ten torsion angles of a near-dinucleotide unit have been studied by combining Fourier averaging and clustering. Besides the known variants of the A-, B- and Z-DNA forms, we have also identified combined A + B backbone-deformed conformers, e.g. with alpha/gamma switches, and a few conformers with a syn orientation of bases occurring e.g. in G-quadruplex structures. A plethora of A- and B-like conformers show a close relationship between the A- and B-form double helices. A comparison of the populations of the conformers occurring in naked and complexed DNA has revealed a significant broadening of the DNA conformational space in the complexes, but the conformers still remain within the limits defined by the A- and B- forms. Possible sequence preferences, important for sequence-dependent recognition, have been assessed for the main A and B conformers by means of statistical goodness-of-fit tests. The structural properties of the backbone in quadruplexes, junctions and histone-core particles are discussed in further detail.
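Averaging torsion angles needs care because they are periodic: the arithmetic mean of 350° and 20° is 185°, which points in the opposite direction from both values. The sketch below uses a plain circular mean to show the issue. It is a simpler stand-in chosen for illustration and is not the Fourier averaging procedure used in the study; the function name `circular_mean` is an assumption.

```python
import math


def circular_mean(angles_deg):
    """Mean direction of angles given in degrees, returned in [0, 360)."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Two hypothetical torsion values straddling the 0/360 boundary.
torsions = [350.0, 20.0]
naive = sum(torsions) / len(torsions)  # arithmetic mean, misleading here
periodic = circular_mean(torsions)     # close to 5 degrees
```

The same periodicity has to be respected when measuring distances between conformers during clustering.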
- MeSH
- DNA, A-form chemistry MeSH
- cytosine chemistry MeSH
- deoxyribonucleotides chemistry MeSH
- DNA-binding proteins chemistry MeSH
- DNA chemistry MeSH
- G-quadruplexes MeSH
- nucleic acid conformation MeSH
- DNA, cruciform chemistry MeSH
- ligands MeSH
- nucleosomes chemistry MeSH
- RNA chemistry MeSH
- base sequence MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, A-form MeSH
- cytosine MeSH
- deoxyribonucleotides MeSH
- DNA-binding proteins MeSH
- DNA MeSH
- DNA, cruciform MeSH
- ligands MeSH
- nucleosomes MeSH
- RNA MeSH
Explicit solvent molecular dynamics simulations (almost 800 ns in total, including locally enhanced sampling runs) were applied with different ion conditions and with two force fields (AMBER and CHARMM) to characterize typical geometries adopted by the flanking bases in RNA kissing-loop complexes. We focus on flanking base positions in multiple x-ray and NMR structures of HIV-1 DIS kissing complexes and a kissing complex from the large ribosomal subunit of Haloarcula marismortui. An initial x-ray open conformation of bulged-out bases in HIV-1 DIS complexes, affected by crystal packing, tends to convert to a closed conformation formed by a consecutive stretch of four stacked purine bases. This is in agreement with those recent crystals where the packing is essentially avoided. We also observed variants of the closed conformation with three stacked bases, and nonnegligible populations of stacked geometries with bulged-in bases were detected, too. The simulation results reconcile differences in positions of the flanking bases observed in x-ray and NMR studies. Our results suggest that bulged-out geometries are somewhat more preferred, which is in accord with recent experiments showing that they may mediate tertiary contacts in biomolecular assemblies or allow binding of aminoglycoside antibiotics.
- MeSH
- models, chemical * MeSH
- dimerization MeSH
- HIV-1 chemistry genetics MeSH
- nucleic acid conformation MeSH
- models, molecular * MeSH
- base pairing genetics MeSH
- transcription initiation site * MeSH
- computer simulation MeSH
- RNA, viral chemistry MeSH
- binding sites MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- RNA, viral MeSH
The hepatitis delta virus (HDV) ribozyme is an RNA enzyme from the human pathogenic HDV. Cations play a crucial role in self-cleavage of the HDV ribozyme by promoting both folding and chemistry. Experimental studies have revealed limited but intriguing details on the location and the structural and catalytic functions of metal ions. Here, we analyze a total of approximately 200 ns of explicit-solvent molecular dynamics simulations to provide a complementary atomistic view of the binding of monovalent and divalent cations as well as water molecules to reaction precursor and product forms of the HDV ribozyme. Our simulations find that an Mg2+ cation binds stably, by both inner- and outer-sphere contacts, to the electronegative catalytic pocket of the reaction precursor, in a position to potentially support chemistry. In contrast, protonation of the catalytically involved C75 in the precursor or artificial placement of this Mg2+ into the product structure results in its swift expulsion from the active site. These findings are consistent with a concerted reaction mechanism in which C75 and hydrated Mg2+ act as general base and acid, respectively. Monovalent cations bind to the active site and elsewhere, assisted by structurally bridging long-residency water molecules, but are generally delocalized.
- MeSH
- magnesium chemistry MeSH
- cations, divalent chemistry MeSH
- cations, monovalent chemistry MeSH
- nucleic acid conformation MeSH
- models, molecular MeSH
- molecular sequence data MeSH
- RNA, catalytic chemistry MeSH
- base sequence MeSH
- sodium chemistry MeSH
- binding sites MeSH
- hepatitis delta virus enzymology MeSH
- water chemistry MeSH
- hydrogen bonding MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Substance names
- magnesium MeSH
- cations, divalent MeSH
- cations, monovalent MeSH
- RNA, catalytic MeSH
- sodium MeSH
- water MeSH
Explicit solvent molecular dynamics (MD) simulations were carried out for sarcin-ricin domain (SRD) motifs from 23S (Escherichia coli) and 28S (rat) rRNAs. The SRD motif consists of a GAGA tetraloop, a G-bulged cross-strand A-stack, a flexible region and a duplex part. Detailed analysis of the overall dynamics, base pairing, hydration, cation binding and other SRD features is presented. The SRD is surprisingly static in multiple 25 ns long simulations and lacks any non-local motions, with root mean square deviation (r.m.s.d.) values between averaged MD and high-resolution X-ray structures of 1-1.4 Å. Modest dynamics are observed in the tetraloop, namely, rotation of the adenine in its apex and a subtle reversible shift of the tetraloop with respect to the adjacent base pair. The deformed flexible region in the low-resolution rat X-ray structure is repaired by simulations. The simulations reveal a few backbone flips, which do not affect positions of bases and do not indicate a force field imbalance. Non-Watson-Crick base pairs are rigid and mediated by long-residency water molecules, while there are several modest cation-binding sites around the SRD. In summary, the SRD is an unusually stiff rRNA building block. Its intrinsic structural and dynamical signatures seen in simulations are strikingly distinct from other rRNA motifs such as Loop E and Kink-turns.
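The r.m.s.d. comparison between an averaged MD structure and a reference structure mentioned in this abstract can be sketched as below. The sketch assumes that all frames and the reference are already superimposed and list the same atoms in the same order; the superposition step itself is omitted. The coordinates are invented toy values and the function names are hypothetical.

```python
import math


def average_structure(frames):
    """Average each atom's coordinates over all frames (same atom order in every frame)."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[i][axis] for frame in frames) / n_frames for axis in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (a[axis] - b[axis]) ** 2
        for a, b in zip(coords_a, coords_b)
        for axis in range(3)
    )
    return math.sqrt(squared / len(coords_a))


# Toy trajectory of two frames with two atoms each (invented coordinates).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

mean_structure = average_structure(frames)
deviation = rmsd(mean_structure, reference)
```

For real trajectories, each frame would first be fitted onto the reference, for example with a least-squares superposition, before averaging.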
- MeSH
- endoribonucleases metabolism MeSH
- Escherichia coli genetics MeSH
- fungal proteins metabolism MeSH
- cations chemistry MeSH
- nucleic acid conformation MeSH
- rats MeSH
- crystallography, X-ray MeSH
- models, molecular * MeSH
- base pairing MeSH
- computer simulation MeSH
- ricin metabolism MeSH
- RNA, ribosomal, 23S chemistry metabolism MeSH
- RNA, ribosomal, 28S chemistry metabolism MeSH
- carbohydrates chemistry MeSH
- binding sites MeSH
- water chemistry MeSH
- hydrogen bonding MeSH
- animals MeSH
- Check Tag
- rats MeSH
- animals MeSH
- Publication type
- journal article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Substance names
- alpha-sarcin MeSH Browser
- endoribonucleases MeSH
- fungal proteins MeSH
- cations MeSH
- ricin MeSH
- RNA, ribosomal, 23S MeSH
- RNA, ribosomal, 28S MeSH
- carbohydrates MeSH
- water MeSH