G4Hunter
Hepatitis B virus (HBV) is one of the most dangerous human pathogenic viruses and is found in all corners of the world. Recent sequencing of ancient HBV genomes revealed that these viruses have accompanied humanity for several millennia. As G-quadruplexes are considered to be potential therapeutic targets in virology, we examined putative G-quadruplex-forming sequences (PQS) in modern and ancient HBV genomes. Our analyses showed the presence of PQS in all 232 tested HBV genomes, with a total number of 1258 motifs and an average frequency of 1.69 PQS per kbp. Notably, the PQS with the highest G4Hunter score in the reference genome is the most highly conserved. Interestingly, the density of PQS motifs is lower in ancient HBV genomes than in their modern counterparts (1.5 and 1.9/kb, respectively). This modern frequency of 1.90 is very close to the PQS frequency of the human genome (1.93) using identical parameters. This indicates that the PQS content in HBV increased over time to become closer to the PQS frequency in the human genome. No statistically significant differences were found between PQS densities in HBV lineages found in different continents. These results, which constitute the first paleogenomics analysis of G4 propensity, are in agreement with our hypothesis that, for viruses causing chronic infections, their PQS frequencies tend to converge evolutionarily with those of their hosts, as a kind of 'genetic camouflage' to both hijack host cell transcriptional regulatory systems and to avoid recognition as foreign material.
- MeSH
- biological evolution MeSH
- G-quadruplexes * MeSH
- genome, human MeSH
- genomics MeSH
- humans MeSH
- paleontology MeSH
- hepatitis B virus * genetics MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
G-quadruplexes are non-B secondary structures with regulatory functions and therapeutic potential. Improvements in sequencing methods recently allowed the completion of the first human chromosome, which is now available as a gapless, end-to-end assembly, with the previously remaining gaps filled and newly identified regions added. We compared the presence of G-quadruplex forming sequences in the current human reference genome (GRCh38) and in the new end-to-end assembly of the X chromosome constructed by high-coverage ultra-long-read nanopore sequencing. This comparison revealed that, even though the corrected length of the chromosome X assembly is surprisingly 1.14% shorter than expected, the number of G-quadruplex forming sequences found in this gapless chromosome is significantly higher, with 493 new motifs having G4Hunter scores above 1.4 and 23 new sequences with G4Hunter scores above 3.5. This observation reflects the improved precision of the new sequencing approaches and points to an underestimation of G-quadruplex propensity in the previous, widely used version of the human genome assembly, especially for motifs with a high G4Hunter score, which are expected to be very stable. These G-quadruplex forming sequences probably remained undiscovered in earlier genome datasets because G-rich and repetitive genomic regions were previously unresolved. These observations allow a precise targeting of these important regulatory regions.
- MeSH
- G-quadruplexes * MeSH
- humans MeSH
- chromosomes, human, X chemistry genetics MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
G-quadruplexes contribute to the regulation of key molecular processes. Their utilization for antiviral therapy is an emerging field of contemporary research. Here we present comprehensive analyses of the presence and localization of putative G-quadruplex forming sequences (PQS) in all viral genomes currently available in the NCBI database (including subviral agents). The G4Hunter algorithm was applied to a pool of 11,000 accessible viral genomes representing 350 Mbp in total. PQS frequencies differ across evolutionary groups of viruses, and PQS are enriched in repeats, replication origins, 5'UTRs and 3'UTRs. Importantly, PQS presence and localization are connected to viral lifecycles and correspond to the type of viral infection rather than to nucleic acid type; while viruses routinely causing persistent infections in Metazoa hosts are enriched for PQS, viruses causing acute infections are significantly depleted for PQS. The unique localization of PQS highlights the importance of G-quadruplex-based regulation of viral replication and life cycle, providing a tool for potential therapeutic targeting.
- MeSH
- databases, nucleic acid * MeSH
- DNA, viral genetics metabolism MeSH
- G-quadruplexes * MeSH
- genome, viral * MeSH
- humans MeSH
- virus diseases * genetics metabolism MeSH
- viruses * genetics metabolism MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
The importance of G-quadruplex-based regulation of gene expression in viruses may point to its potential utilization in therapeutic targeting. Here, we present analyses of the occurrence of putative G-quadruplex-forming sequences (PQS) in all reference viral dsDNA genomes and evaluate their dependence on PQS occurrence in host organisms using the G4Hunter tool. PQS frequencies differ across host taxa without regard to GC content. The overlay of PQS with annotated regions reveals the localization of PQS in specific regions. While abundance in some, such as repeat regions, is shared by all groups, others are unique. PQS are abundant within introns of Eukaryota-infecting viruses, but depleted in introns of bacteria-infecting viruses. We reveal a significant positive correlation between PQS frequencies in dsDNA viruses and corresponding hosts from archaea, bacteria, and eukaryotes. A strong relationship between PQS in a virus and its host indicates their close coevolution and evolutionarily reciprocal mimicking of genome organization.
BACKGROUND: Influenza viruses are dangerous pathogens. Seventy-seven genomes of the recently emerged genotype 4 reassortant Eurasian avian-like H1N1 virus (G4-EA-H1N1) are currently available. We investigated the presence and variation of potential G-quadruplex forming sequences (PQS), which can serve as targets for antiviral treatment. RESULTS: PQS were identified in all 77 genomes. The total number of PQS in G4-EA-H1N1 genomes was 571. Interestingly, the number of PQS per genome in individual closely related viruses varied from 4 to 12. PQS were not randomly distributed in the 8 segments of the G4-EA-H1N1 genome, the highest frequency of PQS being found in the NP segment (1.39 per 1000 nt), which is considered a potential target for antiviral therapy. In contrast, no PQS was found in the NS segment. Analyses of variability pointed to the importance of some PQS; even though genome variation of influenza virus is extreme, the PQS with the highest G4Hunter score is the most conserved in all tested genomes. G-quadruplex formation in vitro was experimentally confirmed using spectroscopic methods. CONCLUSIONS: The results presented here hint at several G-quadruplex-forming sequences in G4-EA-H1N1 genomes that could provide good therapeutic targets.
- MeSH
- influenza, human * MeSH
- G-quadruplexes * MeSH
- genome, viral MeSH
- genotype MeSH
- humans MeSH
- reassortant viruses genetics MeSH
- influenza A virus, H1N1 subtype * genetics MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
The importance of unusual DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, G-quadruplexes (G4s) have gained in popularity during the last decade, and their presence and functional relevance at the DNA and RNA level has been demonstrated in a number of viral, bacterial, and eukaryotic genomes, including humans. Here, we performed the first systematic search for G4-forming sequences in all archaeal genomes available in the NCBI database, investigating their presence and locations using the G4Hunter algorithm. G-quadruplex-prone sequences were identified in all archaeal species, with highly significant differences in frequency, from 0.037 to 15.31 potential quadruplex sequences per kb. While G4 forming sequences were extremely abundant in Hadesarchaea archaeon (strikingly, more than 50% of the Hadesarchaea archaeon isolate WYZ-LMO6 genome is a potential part of a G4-motif), they were very rare in the Parvarchaeota phylum. G-quadruplex forming sequences do not follow a random distribution: they are over-represented in non-coding RNA, suggesting possible roles in ncRNA regulation. These data illustrate the unique and non-random localization of G-quadruplexes in Archaea.
- MeSH
- Archaea classification genetics metabolism MeSH
- archaeal proteins genetics metabolism MeSH
- circular dichroism MeSH
- DNA-binding proteins genetics metabolism MeSH
- DNA chemistry genetics metabolism MeSH
- species specificity MeSH
- phylogeny MeSH
- G-quadruplexes * MeSH
- genome, archaeal genetics MeSH
- genomics methods MeSH
- nucleic acid conformation MeSH
- RNA chemistry genetics metabolism MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
Mechanisms of transcriptional control in malaria parasites are still not fully understood. The positioning patterns of G-quadruplex (G4) DNA motifs in the parasite's AT-rich genome, especially within the var gene family, which encodes virulence factors, and in the vicinity of recombination hotspots, point towards a possible regulatory role of G4 in gene expression and genome stability. Here, we carried out the most comprehensive genome-wide survey to date of G4s in the Plasmodium falciparum genome using G4Hunter, which identifies G4 forming sequences (G4FS) considering their G-richness and G-skewness. We show an enrichment of G4FS in nucleosome-depleted regions and in the first exon of var genes, a pattern that is conserved within the closely related Laverania Plasmodium parasites. Under G4-stabilizing conditions, i.e., following treatment with pyridostatin (a high affinity G4 ligand), we show that a bona fide G4 found in the non-coding strand of var promoters modulates reporter gene expression. Furthermore, transcriptional profiling of pyridostatin-treated parasites shows large scale perturbations, with deregulation affecting, for instance, the ApiAP2 family of transcription factors and genes involved in ribosome biogenesis. Overall, our study highlights G4s as important DNA secondary structures with a role in Plasmodium gene expression regulation, sub-telomeric recombination and var gene biology.
- MeSH
- aminoquinolines pharmacology MeSH
- G-quadruplexes * MeSH
- genome drug effects MeSH
- picolinic acids pharmacology MeSH
- humans MeSH
- malaria drug therapy genetics parasitology MeSH
- nucleotide motifs genetics MeSH
- Plasmodium falciparum genetics pathogenicity MeSH
- promoter regions, genetic genetics MeSH
- gene expression regulation drug effects MeSH
- ribosomes drug effects genetics MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
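The G-richness and G-skewness scoring referred to in the Plasmodium falciparum abstract above can be illustrated with a minimal Python sketch. It follows the commonly described G4Hunter scheme, in which each G in a run scores the run length (capped at 4), each C scores the negative equivalent, and other bases score 0; the function names `base_scores` and `g4hunter_score` are hypothetical and this is not the authors' implementation.

```python
from itertools import groupby


def base_scores(seq: str) -> list[int]:
    """Per-nucleotide scores: G runs are positive, C runs negative, others 0.

    Every base in a run of n identical G (or C) receives min(n, 4)
    (or -min(n, 4)), so long G tracts count more than isolated Gs.
    """
    scores: list[int] = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0
        scores.extend([value] * n)
    return scores


def g4hunter_score(seq: str) -> float:
    """Mean of the per-nucleotide scores; 0.0 for an empty sequence."""
    scores = base_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


# Human telomeric-like repeat: twelve Gs in runs of three, 21 nt in total.
print(round(g4hunter_score("GGGTTAGGGTTAGGGTTAGGG"), 2))  # 1.71
```

A positive score indicates G-rich, G-skewed sequence on the given strand, while a negative score indicates that the complementary strand carries the G runs.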
MOTIVATION: G-quadruplexes (G4) are important regulatory non-B DNA structures with therapeutic potential. A tool for rational design of mutations leading to decreased propensity for G4 formation should be useful in studying G4 functions. Although tools exist for G4 prediction, no easily accessible tool for the rational design of G4 mutations has been available. RESULTS: We developed a web-based tool termed G4Killer that is based on the G4Hunter algorithm. This new tool is a platform-independent and user-friendly application to design mutations crippling G4 propensity in a parsimonious way (i.e., keeping the primary sequence as close as possible to the original one). The tool is integrated into our DNA analyzer server and allows for generating mutated DNA sequences having the desired lowered G4Hunter score with minimal mutation steps. AVAILABILITY AND IMPLEMENTATION: The G4Killer web tool can be accessed at: http://bioinformatics.ibp.cz. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- MeSH
- algorithms MeSH
- DNA MeSH
- G-quadruplexes * MeSH
- mutation MeSH
- sequence analysis, DNA MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
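The idea of lowering a G4Hunter score with as few substitutions as possible, as described in the G4Killer abstract above, can be illustrated with a naive greedy sketch. This is an assumption-laden illustration and not the procedure used by G4Killer: the function `cripple_g4`, the choice of `A` as the replacement base, and the greedy strategy are all hypothetical.

```python
from itertools import groupby


def g4hunter_score(seq: str) -> float:
    """Mean per-base score: G runs score +min(n, 4), C runs -min(n, 4)."""
    total, length = 0, 0
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        sign = 1 if base == "G" else -1 if base == "C" else 0
        total += sign * min(n, 4) * n
        length += n
    return total / length if length else 0.0


def cripple_g4(seq: str, target: float, replacement: str = "A"):
    """Greedily substitute single Gs until the score is <= target.

    At each step, the G whose replacement lowers the score the most is
    mutated (first such position on ties). Returns the mutated sequence
    and the list of mutated 0-based positions.
    """
    seq = seq.upper()
    mutated_positions: list[int] = []
    while g4hunter_score(seq) > target:
        best = None  # (score, position, candidate sequence)
        for i, base in enumerate(seq):
            if base != "G":
                continue
            candidate = seq[:i] + replacement + seq[i + 1:]
            score = g4hunter_score(candidate)
            if best is None or score < best[0]:
                best = (score, i, candidate)
        if best is None:  # no G left to mutate
            break
        mutated_positions.append(best[1])
        seq = best[2]
    return seq, mutated_positions


mutant, positions = cripple_g4("GGGTTAGGGTTAGGGTTAGGG", target=1.0)
print(mutant, positions)
```

In this sketch, breaking a G run in the middle is preferred automatically, because it splits one long run into two short ones and therefore removes the most score per substitution.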
MOTIVATION: Expanding research highlights the importance of guanine quadruplex structures. Therefore, easily accessible tools for quadruplex analyses in DNA and RNA molecules are important for the scientific community. RESULTS: We developed a web version of the G4Hunter application. This new web-based server is a platform-independent and user-friendly application for quadruplex analyses. It allows retrieval of gene/nucleotide sequence entries from NCBI databases and provides complete characterization of the localization and quadruplex propensity of quadruplex-forming sequences. The G4Hunter web application includes an interactive graphical data representation with many useful options including visualization, sorting, data storage and export. AVAILABILITY AND IMPLEMENTATION: The G4Hunter web application can be accessed at: http://bioinformatics.ibp.cz. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- MeSH
- DNA MeSH
- G-quadruplexes * MeSH
- guanine MeSH
- internet MeSH
- computers MeSH
- sequence analysis, DNA MeSH
- software MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
G-quadruplexes (G4) are non-canonical DNA and/or RNA secondary structures formed in guanine-rich regions. Given their over-representation in specific regions in the genome such as promoters and telomeres, they are likely to play important roles in key processes such as transcription, replication or RNA maturation. Putative G4-forming sequences (G4FS) have been reported in humans, yeast, bacteria, viruses and many other organisms. Here we present the first mapping of G-quadruplex sequences in Dictyostelium discoideum, the social amoeba. 'Dicty' is an ameboid protozoan with a small (34 Mb) and extremely AT-rich genome (78%). As a consequence, very few G4-prone motifs are expected. An in silico analysis of the Dictyostelium genome with the G4Hunter software detected 249-1055 G4-prone motifs, depending on the chosen G4Hunter threshold. Interestingly, despite an even lower GC content (as compared to the whole Dicty genome), the density of G4 motifs in Dictyostelium promoters and introns is significantly higher than in the rest of the genome. Fourteen selected sequences located in important genes were characterized by a combination of biophysical and biochemical techniques. Our data show that these sequences form highly stable G4 structures under physiological conditions. Five Dictyostelium genes containing G4-prone motifs in their promoters were studied for the effect of a new G4-binding porphyrin derivative on their expression. Our results demonstrated that the new ligand significantly decreased their expression. Overall, our results constitute the first step towards adopting Dictyostelium discoideum as a 'G4-poor' model for studies on G-quadruplexes.
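Why the number of detected motifs depends on the chosen threshold can be illustrated with a minimal sliding-window sketch in Python. The window size of 25 nt, the thresholds 1.2 and 1.4, the test sequence, and the function names `find_g4_windows` and `merge_windows` are assumptions made for illustration; they are not taken from the study above.

```python
from itertools import groupby


def base_scores(seq: str) -> list[int]:
    """G runs score +min(n, 4) per base, C runs -min(n, 4), others 0."""
    scores: list[int] = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        sign = 1 if base == "G" else -1 if base == "C" else 0
        scores.extend([sign * min(n, 4)] * n)
    return scores


def find_g4_windows(seq: str, window: int = 25, threshold: float = 1.2):
    """Return (start, mean score) for windows with |mean score| >= threshold."""
    scores = base_scores(seq)
    hits: list[tuple[int, float]] = []
    if len(scores) < window:
        return hits
    total = sum(scores[:window])
    for start in range(len(scores) - window + 1):
        if start > 0:
            # Slide by one base: add the entering score, drop the leaving one.
            total += scores[start + window - 1] - scores[start - 1]
        mean = total / window
        if abs(mean) >= threshold:  # abs() also catches C-rich windows
            hits.append((start, mean))
    return hits


def merge_windows(hits, window: int = 25):
    """Merge overlapping or adjacent hit windows into (start, end) regions."""
    regions: list[tuple[int, int]] = []
    for start, _ in hits:
        end = start + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


# AT-rich background with one embedded G-rich motif (illustrative sequence).
seq = "AT" * 20 + "GGGTTAGGGTTAGGGTTAGGG" + "AT" * 20
for cutoff in (1.2, 1.4):
    hits = find_g4_windows(seq, threshold=cutoff)
    print(cutoff, len(hits), merge_windows(hits))
```

Raising the threshold keeps only windows that are almost entirely covered by G runs, so fewer windows pass and the merged regions shrink or disappear; this is the mechanism behind a threshold-dependent range of motif counts.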