Most cited article - PubMed ID 29126205
Complex analyses of inverted repeats in mitochondrial genomes revealed their importance and variability
Noncanonical secondary structures in nucleic acids have been studied intensively in recent years. Important biological roles of cruciform structures formed by inverted repeats (IRs) have been demonstrated in diverse organisms, including humans. Using Palindrome analyser, we analyzed IRs in all accessible bacterial genome sequences to determine their frequencies, lengths, and localizations. IR sequences were identified in all species, but their frequencies differed significantly across various evolutionary groups. We detected 242,373,717 IRs in all 1,565 bacterial genomes. The highest mean IR frequency was detected in the Tenericutes (61.89 IRs/kbp) and the lowest mean frequency was found in the Alphaproteobacteria (27.08 IRs/kbp). IRs were abundant near genes and around regulatory, tRNA, transfer-messenger RNA (tmRNA), and rRNA regions, pointing to the importance of IRs in such basic cellular processes as genome maintenance, DNA replication, and transcription. Moreover, we found that organisms with high IR frequencies were more likely to be endosymbiotic, antibiotic producing, or pathogenic. On the other hand, those with low IR frequencies were far more likely to be thermophilic. This first comprehensive analysis of IRs in all available bacterial genomes demonstrates their genomic ubiquity, nonrandom distribution, and enrichment in genomic regulatory regions. IMPORTANCE Our manuscript reports for the first time a complete analysis of inverted repeats in all fully sequenced bacterial genomes. Thanks to the availability of unique computational resources, we were able to statistically evaluate the presence and localization of these important regulatory sequences in bacterial genomes. This work revealed a strong abundance of these sequences in regulatory regions and provides researchers with a valuable tool for their manipulation.
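The IR counts and per-kbp frequencies reported above can be illustrated with a minimal sketch. This is not the Palindrome analyser implementation; the function names and the default stem and spacer ranges below are assumptions chosen for illustration, and the sketch accepts only perfect repeats (no mismatches).

```python
# Illustrative sketch only: NOT the Palindrome analyser code.
# Parameter defaults (stem 6-30 bp, spacer 0-10 bp) are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_stem=6, max_stem=30, max_spacer=10):
    """Return (start, stem_length, spacer_length) for every perfect IR."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        for stem in range(min_stem, max_stem + 1):
            left = seq[start:start + stem]
            # Stop when the left arm runs off the end or contains
            # an ambiguous base; longer stems would fail as well.
            if len(left) < stem or set(left) - set("ACGT"):
                break
            target = reverse_complement(left)
            for spacer in range(max_spacer + 1):
                right_start = start + stem + spacer
                right = seq[right_start:right_start + stem]
                if len(right) < stem:
                    break
                if right == target:
                    hits.append((start, stem, spacer))
    return hits


def ir_frequency_per_kbp(seq, **kwargs):
    """Number of detected IRs per 1,000 bp of sequence."""
    return 1000 * len(find_inverted_repeats(seq, **kwargs)) / len(seq)
```

For example, `find_inverted_repeats("GAATTC", min_stem=3, max_stem=3, max_spacer=0)` reports a single repeat whose 3 bp arms are adjacent. Nested and overlapping repeats are each counted, so absolute counts depend strongly on the chosen parameters.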
- Keywords
- Palindrome analyser, bacteria domain, bacterial genome analysis, inverted repeats
- MeSH
- Bacteria genetics MeSH
- Phylogeny MeSH
- Genomics * MeSH
- Humans MeSH
- DNA Replication * MeSH
- Base Sequence MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Epigenetics deals with changes in gene expression that are not caused by modifications in the primary sequence of nucleic acids. These changes beyond primary structures of nucleic acids not only include DNA/RNA methylation, but also other reversible conversions, together with histone modifications or RNA interference. In addition, under particular conditions (such as specific ion concentrations or protein-induced stabilization), the right-handed double-stranded DNA helix (B-DNA) can form noncanonical structures commonly described as "non-B DNA" structures. These structures comprise, for example, cruciforms, i-motifs, triplexes, and G-quadruplexes. Their formation often leads to significant differences in replication and transcription rates. Noncanonical RNA structures have also been documented to play important roles in translation regulation and the biology of noncoding RNAs. In human and animal studies, the frequency and dynamics of noncanonical DNA and RNA structures are intensively investigated, especially in the field of cancer research and neurodegenerative diseases. In contrast, noncanonical DNA and RNA structures in plants have been on the fringes of interest for a long time and only a few studies deal with their formation, regulation, and physiological importance for plant stress responses. Herein, we present a review focused on the main fields of epigenetics in plants and their possible roles in stress responses and signaling, with special attention dedicated to noncanonical DNA and RNA structures.
- Keywords
- Acetylation, Chromatin, Epigenetics, G-quadruplex, Gene expression, Histone, Methylation, Non-B DNA, Stress signaling
- MeSH
- DNA genetics chemistry MeSH
- Epigenesis, Genetic MeSH
- G-Quadruplexes * MeSH
- Humans MeSH
- Nucleic Acids * MeSH
- RNA genetics chemistry MeSH
- Plants genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- DNA MeSH
- Nucleic Acids * MeSH
- RNA MeSH
Cruciforms occur when inverted repeat sequences in double-stranded DNA adopt intra-strand hairpins on opposing strands. Biophysical and molecular studies of these structures confirm their characterization as four-way junctions and have demonstrated that several factors influence their stability, including overall chromatin structure and DNA supercoiling. Here, we review our understanding of processes that influence the formation and stability of cruciforms in genomes, covering the range of sequences shown to have biological significance. It is challenging to accurately sequence repetitive DNA sequences, but recent advances in sequencing methods have deepened understanding about the amounts of inverted repeats in genomes from all forms of life. We highlight that, in the majority of genomes, inverted repeats are present in higher numbers than is expected from a random occurrence. It is, therefore, becoming clear that inverted repeats play important roles in regulating many aspects of DNA metabolism, including replication, gene expression, and recombination. Cruciforms are targets for many architectural and regulatory proteins, including topoisomerases, p53, Rif1, and others. Notably, some of these proteins can induce the formation of cruciform structures when they bind to DNA. Inverted repeat sequences also influence the evolution of genomes, and growing evidence highlights their significance in several human diseases, suggesting that the inverted repeat sequences and/or DNA cruciforms could be useful therapeutic targets in some cases.
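The comparison with "a random occurrence" can be made concrete under one simple null model. As an assumption for illustration (not necessarily the baseline used in the reviewed studies), treat a sequence of length $L$ as independent bases drawn uniformly from A, C, G, and T, and count perfect inverted repeats with arm length $n$ and spacer length $s$. Each of the $n$ base pairs in the stem matches with probability $1/4$, and there are $L - 2n - s + 1$ possible start positions, so by linearity of expectation:

```latex
\[
P(\text{perfect IR at a fixed position}) = \left(\tfrac{1}{4}\right)^{n},
\qquad
\mathbb{E}\left[N_{n,s}\right] = (L - 2n - s + 1)\,4^{-n}.
\]
```

For $L = 10^{6}$, $n = 6$, and $s = 0$, this gives $(10^{6} - 11)/4096 \approx 244$ expected repeats; observed counts well above such a baseline indicate enrichment. Real genomes have skewed base composition, so a composition-matched shuffle would be a stricter control.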
- Keywords
- DNA base sequence, DNA structure, DNA supercoiling, cruciform, epigenetics, genome stability, inverted repeat, replication, transcription
- MeSH
- DNA genetics MeSH
- Nucleic Acid Conformation MeSH
- DNA, Cruciform MeSH
- Humans MeSH
- Nucleic Acids * MeSH
- Inverted Repeat Sequences MeSH
- Repetitive Sequences, Nucleic Acid genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
- Names of Substances
- DNA MeSH
- DNA, Cruciform MeSH
- Nucleic Acids * MeSH
The importance of unusual DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, G-quadruplexes (G4s) have gained in popularity during the last decade, and their presence and functional relevance at the DNA and RNA level has been demonstrated in a number of viral, bacterial, and eukaryotic genomes, including humans. Here, we performed the first systematic search of G4-forming sequences in all archaeal genomes available in the NCBI database. In this article, we investigate the presence and locations of G-quadruplex-forming sequences using the G4Hunter algorithm. G-quadruplex-prone sequences were identified in all archaeal species, with highly significant differences in frequency, from 0.037 to 15.31 potential quadruplex sequences per kb. While G4-forming sequences were extremely abundant in Hadesarchaea archaeon (strikingly, more than 50% of the Hadesarchaea archaeon isolate WYZ-LMO6 genome is a potential part of a G4-motif), they were very rare in the Parvarchaeota phylum. The presence of G-quadruplex-forming sequences does not follow a random distribution, with an over-representation in non-coding RNA, suggesting possible roles for ncRNA regulation. These data illustrate the unique and non-random localization of G-quadruplexes in Archaea.
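The G4Hunter scoring mentioned above can be sketched briefly. In the published scoring scheme, each G in a run of G's scores the run length capped at 4, each C scores the negative equivalent, and A/T score 0; the mean score over a sliding window is then compared with a threshold. The window of 25 nt and threshold of 1.2 used below are assumptions (the abstract does not state the settings), and the function names are hypothetical, not the G4Hunter tool's interface.

```python
# Illustrative sketch of G4Hunter-style scoring; not the reference tool.
# window=25 and threshold=1.2 are assumed settings, not from the abstract.


def g4hunter_base_scores(seq):
    """Score each base: G runs positive, C runs negative, capped at 4."""
    seq = seq.upper()
    scores = [0] * len(seq)
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        while j < len(seq) and seq[j] == base:
            j += 1  # extend to the end of the current homopolymer run
        if base in "GC":
            value = min(j - i, 4)
            if base == "C":
                value = -value
            for k in range(i, j):
                scores[k] = value
        i = j
    return scores


def g4hunter_windows(seq, window=25, threshold=1.2):
    """Return (start, mean_score) for windows with |mean| >= threshold."""
    scores = g4hunter_base_scores(seq)
    hits = []
    if len(scores) < window:
        return hits
    total = sum(scores[:window])
    for start in range(len(scores) - window + 1):
        if start > 0:
            # Slide the window by one base.
            total += scores[start + window - 1] - scores[start - 1]
        mean = total / window
        if abs(mean) >= threshold:
            hits.append((start, mean))
    return hits
```

Negative window means flag C-rich stretches, which correspond to G-rich candidates on the complementary strand. Overlapping windows above the threshold would normally be merged before counting sequences per kb; that step is omitted here.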
- Keywords
- Archaea, G4-forming motif, genome analysis, sequence prediction, unusual nucleic acid structures
- MeSH
- Archaea classification genetics metabolism MeSH
- Archaeal Proteins genetics metabolism MeSH
- Circular Dichroism MeSH
- DNA-Binding Proteins genetics metabolism MeSH
- DNA chemistry genetics metabolism MeSH
- Species Specificity MeSH
- Phylogeny MeSH
- G-Quadruplexes * MeSH
- Genome, Archaeal genetics MeSH
- Genomics methods MeSH
- Nucleic Acid Conformation MeSH
- RNA chemistry genetics metabolism MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Archaeal Proteins MeSH
- DNA-Binding Proteins MeSH
- DNA MeSH
- RNA MeSH
DNA is a fundamentally important molecule for all cellular organisms due to its biological role as the store of hereditary, genetic information. On the one hand, genomic DNA is very stable, both in chemical and biological contexts, and this assists its genetic functions. On the other hand, it is also a dynamic molecule, and constant changes in its structure and sequence drive many biological processes, including adaptation and evolution of organisms. DNA genomes contain significant amounts of repetitive sequences, which have divergent functions in the complex processes that involve DNA, including replication, recombination, repair, and transcription. Through their involvement in these processes, repetitive DNA sequences influence the genetic instability and evolution of DNA molecules and they are located non-randomly in all genomes. Mechanisms that influence such genetic instability have been studied in many organisms, including within human genomes where they are linked to various human diseases. Here, we review our understanding of short, simple DNA repeats across a diverse range of bacteria, comparing the prevalence of repetitive DNA sequences in different genomes. We describe the range of DNA structures that have been observed in such repeats, focusing on their propensity to form local, non-B-DNA structures. Finally, we discuss the biological significance of such unusual DNA structures and relate this to studies where the impacts of DNA metabolism on genetic stability are linked to human diseases. Overall, we show that simple DNA repeats in bacteria serve as excellent and tractable experimental models for biochemical studies of their cellular functions and influences.
- Keywords
- DNA metabolism, DNA structure, microsatellites, nucleic acids, repetitive DNA sequences
- MeSH
- Bacteria genetics MeSH
- DNA genetics ultrastructure MeSH
- Genome, Bacterial genetics MeSH
- Genome, Human genetics MeSH
- Nucleic Acid Conformation MeSH
- Humans MeSH
- Microsatellite Repeats genetics MeSH
- Genomic Instability genetics MeSH
- Repetitive Sequences, Nucleic Acid genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Names of Substances
- DNA MeSH
The role of local DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, the significance of G-quadruplexes was demonstrated in the last decade, and their presence and functional relevance has been demonstrated in many genomes, including humans. In this study, we analyzed the presence and locations of G-quadruplex-forming sequences by G4Hunter in all complete bacterial genomes available in the NCBI database. G-quadruplex-forming sequences were identified in all species; however, the frequency differed significantly across evolutionary groups. The highest frequency of G-quadruplex-forming sequences was detected in the subgroup Deinococcus-Thermus, and the lowest frequency in Thermotogae. G-quadruplex-forming sequences are non-randomly distributed and are favored in various evolutionary groups. G-quadruplex-forming sequences are enriched in ncRNA segments followed by mRNAs. Analyses of surrounding sequences showed G-quadruplex-forming sequences around tRNA and regulatory sequences. These data point to the unique and non-random localization of G-quadruplex-forming sequences in bacterial genomes.
- Keywords
- G-quadruplex, G4Hunter, bacteria, bioinformatics, deinococcus
- MeSH
- Bacteria genetics MeSH
- DNA, Bacterial chemistry MeSH
- Phylogeny MeSH
- G-Quadruplexes * MeSH
- Genome, Bacterial MeSH
- Nucleic Acid Conformation MeSH
- Humans MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- DNA, Bacterial MeSH
Chloroplasts are key organelles in the management of oxygen in algae and plants and are therefore crucial for all living beings that consume oxygen. Chloroplasts typically contain a circular DNA molecule with nucleus-independent replication and heredity. Using "palindrome analyser" we performed complete analyses of short inverted repeats (S-IRs) in all chloroplast DNAs (cpDNAs) available from the NCBI genome database. Our results provide basic parameters of cpDNAs including comparative information on localization, frequency, and differences in S-IR presence. In a total of 2,565 cpDNA sequences available, the average frequency of S-IRs in cpDNA genomes is 45 S-IRs per kbp, significantly higher than that found in mitochondrial DNA sequences. The frequency of S-IRs in cpDNAs generally decreased with S-IR length, but not for S-IRs 15, 22, 24, or 27 bp long, which are significantly more abundant than S-IRs with other lengths. These results point to the importance of specific S-IRs in cpDNA genomes. Moreover, comparison by Levenshtein distance of S-IR similarities showed that a limited number of S-IR sequences are shared in the majority of cpDNAs. S-IRs are not located randomly in cpDNAs, but are length-dependently enriched in specific locations, including the repeat region, stem, introns, and tRNA regions. The highest enrichment was found for 12 bp and longer S-IRs in the stem-loop region followed by 12 bp and longer S-IRs located before the repeat region. On the other hand, S-IRs are relatively rare in rRNA sequences and around introns. These data show nonrandom and conserved arrangements of S-IRs in chloroplast genomes.
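The Levenshtein distance used to compare S-IR sequences counts the minimum number of single-base insertions, deletions, and substitutions needed to turn one sequence into another. A minimal sketch of the standard dynamic-programming computation follows; the function name is hypothetical, and how the authors grouped or thresholded distances is not stated in the abstract.

```python
# Illustrative sketch of the standard Levenshtein (edit) distance.
# How distances were aggregated across cpDNAs is not specified above.


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]
```

Two S-IRs that differ by one missing base, such as `GAATTC` and `GATTC`, have a distance of 1, so a small distance cutoff would group near-identical repeats found in different cpDNAs.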
- MeSH
- Chloroplasts genetics MeSH
- DNA, Chloroplast * MeSH
- Phylogeny MeSH
- Genome, Chloroplast MeSH
- Introns MeSH
- Evolution, Molecular MeSH
- Inverted Repeat Sequences * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- DNA, Chloroplast * MeSH
p73 is a member of the p53 protein family and has essential functions in several signaling pathways involved in development, differentiation, DNA damage responses and cancer. As a transcription factor, p73 achieves these functions by binding to consensus DNA sequences, and p73 shares at least partial target DNA binding sequence specificity with p53. Transcriptional activation by p73 has been demonstrated for more than fifty p53 targets in yeast and/or human cancer cell lines. It has also been shown previously that p53 binding to DNA is strongly dependent on DNA topology and the presence of inverted repeats that can form DNA cruciforms, but whether p73 transcriptional activity has similar dependence has not been investigated. Therefore, we evaluated p73 binding to a set of p53-response elements with identical theoretical binding affinity in their linear state, but different probabilities to form extrahelical structures. We show by a yeast-based assay that transactivation in vivo correlated more with the relative propensity of a response element to form cruciforms than with its expected in vitro DNA binding affinity. Structural features of p73 target sites are therefore likely to be an important determinant of its transactivation function.
- MeSH
- Transcriptional Activation MeSH
- Nucleic Acid Conformation MeSH
- Yeasts genetics metabolism MeSH
- Humans MeSH
- Tumor Suppressor Protein p53 metabolism MeSH
- Inverted Repeat Sequences * MeSH
- Tumor Protein p73 chemistry genetics metabolism MeSH
- Base Sequence MeSH
- Protein Binding MeSH
- Binding Sites * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Tumor Suppressor Protein p53 MeSH
- Tumor Protein p73 MeSH