Most cited article - PubMed ID 21816114
Cruciform structures are a common DNA feature important for regulating biological processes
Recently published in vivo observations have highlighted the presence of cruciform structures within the genome, suggesting their potential significance in the rapid recognition of the target sequence for transcription factor binding. In this in vitro study, we investigate the organization and stability of the sense (coding) strand within the Serum Response Element of the c-Fos gene promoter (c-Fos SRE), specifically focusing on segments spanning 12 to 36 nucleotides, centered around the CArG-box. Through a thorough examination of UV absorption patterns with varying temperatures, we identified the emergence of a remarkably stable structure, which we conclusively characterized as a hairpin using complementary 1H NMR experiments. Our research decisively ruled out the formation of homoduplexes, as confirmed by supplementary fluorescence experiments. Utilizing molecular dynamics simulations with atomic distance constraints derived from NMR data, we explored the structural intricacies of the compact hairpin. Notably, the loop consisting of the six-membered A/T sequence demonstrated substantial stabilization through extensive stacking, non-canonical inter-base hydrogen bonding, and hydrophobic clustering of thymine methyl groups. These findings suggest the potential of the c-Fos SRE to adopt a cruciform structure (consisting of two opposing hairpins), potentially providing a topological recognition site for the SRF transcription factor under cellular conditions. Our results should inspire further biochemical and in vivo studies to explore the functional implications of these non-canonical DNA structures.
- Publication type
- Journal Article MeSH
Nucleic acids are not only static carriers of genetic information but also play vital roles in controlling cellular lifecycles through their fascinating structural diversity [...].
- MeSH
- DNA * chemistry metabolism MeSH
- Nucleic Acid Conformation * MeSH
- Humans MeSH
- RNA * chemistry metabolism MeSH
- Computational Biology * methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Introductory Journal Article MeSH
- Editorial MeSH
- Names of Substances
- DNA * MeSH
- RNA * MeSH
Non-canonical structures (NCS) refer to the various forms of DNA that differ from the B-conformation described by Watson and Crick. It has been found that these structures are common components of the genome, actively participating in its essential functions. The present review is focused on the nine kinds of NCS appearing or likely to appear in human ribosomal DNA (rDNA): supercoiling structures, R-loops, G-quadruplexes, i-motifs, DNA triplexes, cruciform structures, DNA bubbles, and A and Z DNA conformations. We discuss the conditions of their generation, including their sequence specificity, distribution within the locus, dynamics, and beneficial and detrimental roles in the cell.
- Keywords
- DNA quadruplexes, Non-canonical DNA, R-loops, rDNA
- MeSH
- G-Quadruplexes * MeSH
- Nucleic Acid Conformation MeSH
- Humans MeSH
- DNA, Ribosomal genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
- Names of Substances
- DNA, Ribosomal MeSH
Noncanonical secondary structures in nucleic acids have been studied intensively in recent years. Important biological roles of cruciform structures formed by inverted repeats (IRs) have been demonstrated in diverse organisms, including humans. Using Palindrome analyser, we analyzed IRs in all accessible bacterial genome sequences to determine their frequencies, lengths, and localizations. IR sequences were identified in all species, but their frequencies differed significantly across various evolutionary groups. We detected 242,373,717 IRs in all 1,565 bacterial genomes. The highest mean IR frequency was detected in the Tenericutes (61.89 IRs/kbp) and the lowest mean frequency was found in the Alphaproteobacteria (27.08 IRs/kbp). IRs were abundant near genes and around regulatory, tRNA, transfer-messenger RNA (tmRNA), and rRNA regions, pointing to the importance of IRs in such basic cellular processes as genome maintenance, DNA replication, and transcription. Moreover, we found that organisms with high IR frequencies were more likely to be endosymbiotic, antibiotic producing, or pathogenic. On the other hand, those with low IR frequencies were far more likely to be thermophilic. This first comprehensive analysis of IRs in all available bacterial genomes demonstrates their genomic ubiquity, nonrandom distribution, and enrichment in genomic regulatory regions. IMPORTANCE: Our manuscript reports for the first time a complete analysis of inverted repeats in all fully sequenced bacterial genomes. Thanks to the availability of unique computational resources, we were able to statistically evaluate the presence and localization of these important regulatory sequences in bacterial genomes. This work revealed a strong abundance of these sequences in regulatory regions and provides researchers with a valuable tool for their manipulation.
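To make the IR frequency metric (IRs/kbp) in the abstract above concrete, the following is a minimal sketch of how exact inverted repeats could be located and normalized per kilobase. It is not the Palindrome analyser algorithm; the function names, the fixed arm length, and the spacer limit are illustrative assumptions, and mismatches in the arms are not considered.

```python
# Illustrative sketch only: a brute-force search for exact inverted repeats.
# Names and default parameters are assumptions, not taken from Palindrome analyser.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int = 6, max_spacer: int = 10):
    """Return (start, arm_len, spacer_len) for every exact inverted repeat.

    An inverted repeat here is a left arm, a spacer of 0..max_spacer bases,
    and a right arm equal to the reverse complement of the left arm.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n - 2 * arm_len + 1):
        left = seq[start:start + arm_len]
        if set(left) - set("ACGT"):
            continue  # skip arms containing ambiguous bases such as N
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            if right_start + arm_len > n:
                break
            if seq[right_start:right_start + arm_len] == target:
                hits.append((start, arm_len, spacer))
    return hits


def ir_frequency_per_kbp(n_irs: int, genome_len_bp: int) -> float:
    """Normalize an IR count to IRs per 1,000 bp."""
    return 1000.0 * n_irs / genome_len_bp


if __name__ == "__main__":
    toy = "ACGTTGAAAACAACGT"  # hypothetical sequence: 6-nt arms, 4-nt spacer
    found = find_inverted_repeats(toy)
    print(found, ir_frequency_per_kbp(len(found), len(toy)))
```

Normalizing by sequence length is what allows genomes of very different sizes to be compared on one scale, as the abstract does across bacterial groups.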
- Keywords
- Palindrome analyser, bacteria domain, bacterial genome analysis, inverted repeats
- MeSH
- Bacteria genetics MeSH
- Phylogeny MeSH
- Genomics * MeSH
- Humans MeSH
- DNA Replication * MeSH
- Base Sequence MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Cruciforms occur when inverted repeat sequences in double-stranded DNA adopt intra-strand hairpins on opposing strands. Biophysical and molecular studies of these structures confirm their characterization as four-way junctions and have demonstrated that several factors influence their stability, including overall chromatin structure and DNA supercoiling. Here, we review our understanding of processes that influence the formation and stability of cruciforms in genomes, covering the range of sequences shown to have biological significance. It is challenging to accurately sequence repetitive DNA sequences, but recent advances in sequencing methods have deepened our understanding of the abundance of inverted repeats in genomes from all forms of life. We highlight that, in the majority of genomes, inverted repeats are present in higher numbers than is expected from a random occurrence. It is, therefore, becoming clear that inverted repeats play important roles in regulating many aspects of DNA metabolism, including replication, gene expression, and recombination. Cruciforms are targets for many architectural and regulatory proteins, including topoisomerases, p53, Rif1, and others. Notably, some of these proteins can induce the formation of cruciform structures when they bind to DNA. Inverted repeat sequences also influence the evolution of genomes, and growing evidence highlights their significance in several human diseases, suggesting that the inverted repeat sequences and/or DNA cruciforms could be useful therapeutic targets in some cases.
- Keywords
- DNA base sequence, DNA structure, DNA supercoiling, cruciform, epigenetics, genome stability, inverted repeat, replication, transcription
- MeSH
- DNA genetics MeSH
- Nucleic Acid Conformation MeSH
- DNA, Cruciform MeSH
- Humans MeSH
- Nucleic Acids * MeSH
- Inverted Repeat Sequences MeSH
- Repetitive Sequences, Nucleic Acid genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
- Names of Substances
- DNA MeSH
- DNA, Cruciform MeSH
- Nucleic Acids * MeSH
Parasitic helminths are highly prevalent, infecting ∼2 billion people worldwide and causing inflammatory responses, malnutrition and anemia, which are the primary causes of morbidity. In addition, helminth infections of cattle have a significant economic impact on livestock production, milk yield and fertility. The etiological agents of helminth infections are mainly Nematodes (roundworms) and Platyhelminths (flatworms). G-quadruplexes (G4) are unusual nucleic acid structures formed by G-rich sequences that can be recognized by specific G4 ligands. Here we used the G4Hunter Web Tool to identify and compare potential G4 sequences (PQS) in the nuclear and mitochondrial genomes of various helminths to identify G4 ligand targets. PQS are nonrandomly distributed in these genomes and often located in the proximity of genes. Unexpectedly, a Nematode, Ascaris lumbricoides, was found to be highly enriched in stable PQS. This species can tolerate high-stability G4 structures, which are not counter-selected at all, in stark contrast to most other species. We experimentally confirmed G4 formation for sequences found in four different parasitic helminths. Small molecules able to selectively recognize G4 were found to bind to Schistosoma mansoni G4 motifs. Two of these ligands demonstrated potent activity against both larval and adult stages of this parasite.
- MeSH
- Helminths genetics MeSH
- G-Quadruplexes * MeSH
- Genome MeSH
- Nematoda * genetics MeSH
- Humans MeSH
- Ligands MeSH
- Parasites genetics MeSH
- Platyhelminths * genetics MeSH
- Cattle MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Cattle MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Ligands MeSH
Z-DNA and Z-RNA are functionally important left-handed structures of nucleic acids, which play a significant role in several molecular and biological processes including DNA replication, gene expression regulation and viral nucleic acid sensing. Most proteins that have been proven to interact with Z-DNA/Z-RNA contain the so-called Zα domain, which is structurally well conserved. To date, only eight proteins with a Zα domain have been described within a few organisms (including human, mouse, Danio rerio, Trypanosoma brucei and some viruses). Therefore, this paper aimed to search for new Z-DNA/Z-RNA binding proteins in the complete PDB structures database and among the AlphaFold2 protein models. A structure-based similarity search found 14 proteins with a highly similar Zα domain structure among experimentally defined proteins and 185 proteins with a putative Zα domain among the AlphaFold2 models. Structure-based alignment and molecular docking confirmed high functional conservation of amino acids involved in Z-DNA/Z-RNA binding, suggesting that Z-DNA/Z-RNA recognition may play an important role in a variety of cellular processes.
- Keywords
- Z-DNA, Z-RNA, Zα domain, bioinformatics, protein binding
- MeSH
- DNA-Binding Proteins chemistry metabolism MeSH
- Protein Interaction Domains and Motifs * MeSH
- Nucleic Acid Conformation MeSH
- Protein Conformation MeSH
- Models, Molecular * MeSH
- RNA-Binding Proteins chemistry metabolism MeSH
- RNA chemistry metabolism MeSH
- Amino Acid Sequence MeSH
- Molecular Dynamics Simulation MeSH
- Molecular Docking Simulation MeSH
- Protein Binding MeSH
- Binding Sites MeSH
- Structure-Activity Relationship MeSH
- DNA, Z-Form chemistry metabolism MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- DNA-Binding Proteins MeSH
- RNA-Binding Proteins MeSH
- RNA MeSH
- DNA, Z-Form MeSH
R-loops are common non-B nucleic acid structures formed by a three-stranded nucleic acid composed of an RNA-DNA hybrid and a displaced single-stranded DNA (ssDNA) loop. Because aberrant R-loop formation leads to increased mutagenesis, hyper-recombination, rearrangements, and transcription-replication collisions, it is regarded as important in human diseases. Therefore, its prevalence and distribution in genomes are studied intensively. However, in silico tools for R-loop prediction are limited, and therefore, we have developed the R-loop tracker tool, which was implemented as a part of the DNA Analyser web server. This new tool is focused upon (1) prediction of R-loops in genomic DNA without length and sequence limitations; (2) integration of R-loop tracker results with other tools for nucleic acids analyses, including Genome Browser; (3) internal cross-evaluation of in silico results with experimental data, where available; (4) easy export and correlation analyses with other genome features and markers; and (5) enhanced visualization outputs. Our new R-loop tracker tool is freely accessible on the web pages of DNA Analyser tools, and its implementation on the web-based server allows effective analyses not only for DNA segments but also for full chromosomes and genomes.
- Keywords
- RNA–DNA hybrid, non-B structure, sequence analysis
- MeSH
- Algorithms * MeSH
- DNA chemistry genetics MeSH
- Genomics methods MeSH
- Internet statistics & numerical data MeSH
- Humans MeSH
- Genomic Instability * MeSH
- R-Loop Structures * MeSH
- Software MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- DNA MeSH
SARS-CoV-2 is an intensively investigated virus from the order Nidovirales (Coronaviridae family) that causes COVID-19 disease in humans. Through enormous scientific effort, thousands of viral strains have been sequenced to date, thereby creating a strong background for deep bioinformatics studies of the SARS-CoV-2 genome. In this study, we inspected high-frequency mutations of SARS-CoV-2 and carried out systematic analyses of their overlay with inverted repeat (IR) loci and CpG islands. The main conclusion of our study is that SARS-CoV-2 hot-spot mutations are significantly enriched within both IRs and CpG island loci. This points to their role in genomic instability and may predict further mutational drive of the SARS-CoV-2 genome. Moreover, CpG islands are strongly enriched upstream from viral ORFs and thus could play important roles in transcription and the viral life cycle. We hypothesize that hypermethylation of these loci will decrease the transcription of viral ORFs and could therefore limit the progression of the disease.
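The overlay analysis described above (hot-spot mutations versus IR and CpG island loci) can be illustrated with a simple enrichment calculation. The abstract does not state which statistical test was used, so the one-sided binomial test below is an assumption, as are all function names and the toy coordinates. The sketch compares the fraction of mutations falling inside annotated intervals with the fraction of the genome those intervals cover.

```python
# Illustrative sketch only: the test choice, names, and data are assumptions.
from math import comb


def count_in_intervals(positions, intervals):
    """Count positions that fall inside any half-open [start, end) interval."""
    return sum(any(s <= p < e for s, e in intervals) for p in positions)


def covered_length(intervals):
    """Total number of bases covered by the intervals, merging overlaps."""
    total = 0
    cur_s = cur_e = None
    for s, e in sorted(intervals):
        if cur_e is None or s > cur_e:
            if cur_e is not None:
                total += cur_e - cur_s
            cur_s, cur_e = s, e
        else:
            cur_e = max(cur_e, e)
    if cur_e is not None:
        total += cur_e - cur_s
    return total


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enrichment(positions, intervals, genome_len):
    """Return (hits, fold enrichment, one-sided p-value) under a uniform null."""
    n = len(positions)
    p = covered_length(intervals) / genome_len
    if n == 0 or p == 0:
        raise ValueError("need at least one position and one covered base")
    k = count_in_intervals(positions, intervals)
    return k, (k / n) / p, binomial_upper_tail(k, n, p)


if __name__ == "__main__":
    # Hypothetical toy data: 4 mutation sites, 20 of 100 bases annotated.
    print(enrichment([5, 12, 15, 50], [(0, 10), (10, 20)], 100))
```

A uniform null is the simplest possible background model; a real analysis might instead need to account for local base composition or sequencing coverage.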
- Keywords
- CpG methylation, SARS-CoV-2, hot spot, inverted repeats
- MeSH
- COVID-19 virology MeSH
- CpG Islands * MeSH
- Genome, Viral MeSH
- Humans MeSH
- DNA Methylation MeSH
- Mutation * MeSH
- SARS-CoV-2 genetics MeSH
- Protein Binding MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Influenza viruses are dangerous pathogens. Seventy-seven genomes of recently emerged genotype 4 reassortant Eurasian avian-like H1N1 virus (G4-EA-H1N1) are currently available. We investigated the presence and variation of potential G-quadruplex forming sequences (PQS), which can serve as targets for antiviral treatment. RESULTS: PQS were identified in all 77 genomes. The total number of PQS in G4-EA-H1N1 genomes was 571. Interestingly, the number of PQS per genome in individual closely related viruses varied from 4 to 12. PQS were not randomly distributed in the 8 segments of the G4-EA-H1N1 genome, the highest frequency of PQS being found in the NP segment (1.39 per 1000 nt), which is considered a potential target for antiviral therapy. In contrast, no PQS was found in the NS segment. Analyses of variability pointed to the importance of some PQS; even if genome variation of influenza virus is extreme, the PQS with the highest G4Hunter score is the most conserved in all tested genomes. G-quadruplex formation in vitro was experimentally confirmed using spectroscopic methods. CONCLUSIONS: The results presented here hint at several G-quadruplex-forming sequences in G4-EA-H1N1 genomes that could provide good therapeutic targets.
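For readers unfamiliar with the G4Hunter score and the per-1000-nt frequency cited above, here is a minimal sketch of the scoring idea as it is commonly described: each G in a run scores +min(run length, 4), each C in a run scores the negative of that, A and T score 0, and the scores are averaged over a sliding window. The window size (25) and threshold (1.2) are assumed defaults, since the abstract does not state the parameters used. Unlike the published tool, this sketch does not merge overlapping windows into single PQS, and the function names are illustrative.

```python
# Illustrative sketch only: window/threshold defaults are assumptions and
# overlapping windows are reported individually rather than merged.
import re


def g4hunter_base_scores(seq: str):
    """Per-base scores: G runs positive, C runs negative, capped at 4."""
    seq = seq.upper()
    scores = [0] * len(seq)
    for m in re.finditer(r"G+|C+", seq):
        value = min(len(m.group()), 4)
        if m.group()[0] == "C":
            value = -value
        for i in range(m.start(), m.end()):
            scores[i] = value
    return scores


def g4hunter_windows(seq: str, window: int = 25, threshold: float = 1.2):
    """Return (start, mean score) for windows with |mean| >= threshold."""
    scores = g4hunter_base_scores(seq)
    hits = []
    for start in range(len(seq) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) >= threshold:
            hits.append((start, round(mean, 3)))
    return hits


def pqs_per_1000_nt(n_pqs: int, length_nt: int) -> float:
    """Normalize a PQS count to PQS per 1,000 nucleotides."""
    return 1000.0 * n_pqs / length_nt


if __name__ == "__main__":
    toy = "GGGTTAGGGTTAGGGTTAGGGTTAA"  # hypothetical 25-nt G-rich sequence
    print(g4hunter_windows(toy))
```

As a consistency check on the reported figures, 571 PQS across 77 genomes is about 7.4 PQS per genome, which sits inside the stated range of 4 to 12.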
- Keywords
- G-quadruplex, G4Hunter, Influenza virus
- MeSH
- Influenza, Human * MeSH
- G-Quadruplexes * MeSH
- Genome, Viral MeSH
- Genotype MeSH
- Humans MeSH
- Reassortant Viruses genetics MeSH
- Influenza A Virus, H1N1 Subtype * genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH