Most cited article - PubMed ID 32628663
Genome wide distribution of G-quadruplexes and their impact on gene expression in malaria parasites
Non-canonical (non-B) DNA structures-e.g. bent DNA, hairpins, G-quadruplexes (G4s), Z-DNA, etc.-which form at certain sequence motifs (e.g. A-phased repeats, inverted repeats, etc.), have emerged as important regulators of cellular processes and drivers of genome evolution. Yet, they have been understudied due to their repetitive nature and potentially inaccurate sequences generated with short-read technologies. Here we comprehensively characterize such motifs in the long-read telomere-to-telomere (T2T) genomes of human, bonobo, chimpanzee, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. Non-B DNA motifs are enriched at the genomic regions added to T2T assemblies and occupy 9%-15%, 9%-11%, and 12%-38% of autosomes and chromosomes X and Y, respectively. G4s and Z-DNA are enriched at promoters and enhancers, as well as at origins of replication. Repetitive sequences harbor more non-B DNA motifs than non-repetitive sequences, especially in the short arms of acrocentric chromosomes. Most centromeres and/or their flanking regions are enriched in at least one non-B DNA motif type, consistent with a potential role of non-B structures in determining centromeres. Our results highlight the uneven distribution of predicted non-B DNA structures across ape genomes and suggest their novel functions in previously inaccessible genomic regions.
- MeSH
- DNA * chemistry genetics MeSH
- G-Quadruplexes MeSH
- Genome, Human MeSH
- Genome * MeSH
- Hominidae * genetics MeSH
- Humans MeSH
- Nucleotide Motifs MeSH
- Pan troglodytes genetics MeSH
- Repetitive Sequences, Nucleic Acid MeSH
- Telomere * genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- DNA * MeSH
Non-canonical (non-B) DNA structures-e.g., bent DNA, hairpins, G-quadruplexes (G4s), Z-DNA, etc.-which form at certain sequence motifs (e.g., A-phased repeats, inverted repeats, etc.), have emerged as important regulators of cellular processes and drivers of genome evolution. Yet, they have been understudied due to their repetitive nature and potentially inaccurate sequences generated with short-read technologies. Here we comprehensively characterize such motifs in the long-read telomere-to-telomere (T2T) genomes of human, bonobo, chimpanzee, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. Non-B DNA motifs are enriched at the genomic regions added to T2T assemblies, and occupy 9-15%, 9-11%, and 12-38% of autosomes, and chromosomes X and Y, respectively. G4s and Z-DNA are enriched at promoters and enhancers, as well as at origins of replication. Repetitive sequences harbor more non-B DNA motifs than non-repetitive sequences, especially in the short arms of acrocentric chromosomes. Most centromeres and/or their flanking regions are enriched in at least one non-B DNA motif type, consistent with a potential role of non-B structures in determining centromeres. Our results highlight the uneven distribution of predicted non-B DNA structures across ape genomes and suggest their novel functions in previously inaccessible genomic regions.
- Publication type
- Journal Article MeSH
- Preprint MeSH
P53, P63, and P73 proteins belong to the P53 family of transcription factors, sharing a common gene organization that, from the P1 and P2 promoters, produces two groups of mRNAs encoding proteins with different N-terminal regions; moreover, alternative splicing events at the C-terminus further contribute to the generation of multiple isoforms. P53 family proteins can influence a plethora of cellular pathways, mainly through direct binding to specific DNA sequences known as response elements (REs) and transactivation of the corresponding target genes. However, transcriptional activation by P53 family members can be regulated at multiple levels, including the DNA topology at responsive promoters. Here, using a yeast-based functional assay, we evaluated the influence that a G-quadruplex (G4) prone sequence adjacent to the P53 RE derived from the apoptotic PUMA target gene can exert on the transactivation potential of full-length and N-terminal truncated P53 family α isoforms (wild-type and mutant). Our results show that the presence of a G4 prone sequence upstream or downstream of the P53 RE leads to significant changes in the relative activity of P53 family proteins, emphasizing the potential role of structural DNA features as modifiers of P53 family functions at target promoter sites.
- Keywords
- G-quadruplex (G4) prone sequence, P53 family, transactivation potential, wild-type and mutant P53/P63 proteins, yeast
- MeSH
- Apoptosis genetics MeSH
- DNA genetics ultrastructure MeSH
- G-Quadruplexes * MeSH
- Nucleic Acid Conformation MeSH
- Humans MeSH
- Membrane Proteins genetics ultrastructure MeSH
- Tumor Suppressor Protein p53 genetics ultrastructure MeSH
- Promoter Regions, Genetic genetics MeSH
- Tumor Protein p73 genetics ultrastructure MeSH
- Apoptosis Regulatory Proteins genetics MeSH
- Proto-Oncogene Proteins genetics MeSH
- Response Elements genetics MeSH
- Saccharomyces cerevisiae genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- BBC3 protein, human MeSH Browser
- CKAP4 protein, human MeSH Browser
- DNA MeSH
- Membrane Proteins MeSH
- Tumor Suppressor Protein p53 MeSH
- Tumor Protein p73 MeSH
- Apoptosis Regulatory Proteins MeSH
- Proto-Oncogene Proteins MeSH
- TP73 protein, human MeSH Browser
BACKGROUND: Influenza viruses are dangerous pathogens. Seventy-seven genomes of the recently emerged genotype 4 reassortant Eurasian avian-like H1N1 virus (G4-EA-H1N1) are currently available. We investigated the presence and variation of potential G-quadruplex forming sequences (PQS), which can serve as targets for antiviral treatment. RESULTS: PQS were identified in all 77 genomes. The total number of PQS in G4-EA-H1N1 genomes was 571. Interestingly, the number of PQS per genome in individual closely related viruses varied from 4 to 12. PQS were not randomly distributed across the 8 segments of the G4-EA-H1N1 genome; the highest frequency of PQS was found in the NP segment (1.39 per 1000 nt), which is considered a potential target for antiviral therapy. In contrast, no PQS was found in the NS segment. Analyses of variability pointed to the importance of some PQS: although genome variation of influenza virus is extreme, the PQS with the highest G4Hunter score is the most conserved in all tested genomes. G-quadruplex formation in vitro was experimentally confirmed using spectroscopic methods. CONCLUSIONS: The results presented here point to several G-quadruplex-forming sequences in G4-EA-H1N1 genomes that could provide good therapeutic targets.
- Keywords
- G-quadruplex, G4Hunter, Influenza virus
- MeSH
- Influenza, Human * MeSH
- G-Quadruplexes * MeSH
- Genome, Viral MeSH
- Genotype MeSH
- Humans MeSH
- Reassortant Viruses genetics MeSH
- Influenza A Virus, H1N1 Subtype * genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
The importance of unusual DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, G-quadruplexes (G4s) have gained in popularity during the last decade, and their presence and functional relevance at the DNA and RNA level have been demonstrated in a number of viral, bacterial, and eukaryotic genomes, including humans. Here, we performed the first systematic search for G4-forming sequences in all archaeal genomes available in the NCBI database, investigating their presence and locations using the G4Hunter algorithm. G-quadruplex-prone sequences were identified in all archaeal species, with highly significant differences in frequency, from 0.037 to 15.31 potential quadruplex sequences per kb. While G4-forming sequences were extremely abundant in Hadesarchaea archaeon (strikingly, more than 50% of the Hadesarchaea archaeon isolate WYZ-LMO6 genome is a potential part of a G4 motif), they were very rare in the Parvarchaeota phylum. G-quadruplex forming sequences do not follow a random distribution: they are over-represented in non-coding RNA, suggesting possible roles in ncRNA regulation. These data illustrate the unique and non-random localization of G-quadruplexes in Archaea.
- Keywords
- Archaea, G4-forming motif, genome analysis, sequence prediction, unusual nucleic acid structures
- MeSH
- Archaea classification genetics metabolism MeSH
- Archaeal Proteins genetics metabolism MeSH
- Circular Dichroism MeSH
- DNA-Binding Proteins genetics metabolism MeSH
- DNA chemistry genetics metabolism MeSH
- Species Specificity MeSH
- Phylogeny MeSH
- G-Quadruplexes * MeSH
- Genome, Archaeal genetics MeSH
- Genomics methods MeSH
- Nucleic Acid Conformation MeSH
- RNA chemistry genetics metabolism MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Archaeal Proteins MeSH
- DNA-Binding Proteins MeSH
- DNA MeSH
- RNA MeSH
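Several abstracts above report G4Hunter-based counts of potential quadruplex sequences (PQS) and frequencies per 1000 nt or per kb. The following is a minimal Python sketch of that kind of scan, not the code used in any of the listed studies. It assumes the commonly described G4Hunter scoring scheme (each base in a G run scores the run length capped at 4, C runs score the negative equivalent, A and T score 0, and a window is reported when the absolute mean score reaches a threshold). The window size of 25 and threshold of 1.2 are assumed defaults, since the abstracts do not state the settings used, and all function names are hypothetical.

```python
from itertools import groupby


def g4hunter_scores(seq):
    """Per-base scores: G runs positive, C runs negative, run length capped at 4."""
    scores = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0  # A, T and any other symbol contribute nothing
        scores.extend([value] * n)
    return scores


def window_hits(seq, window=25, threshold=1.2):
    """Start positions of windows whose absolute mean score reaches the threshold.

    window and threshold are assumed defaults, not values taken from the abstracts.
    """
    scores = g4hunter_scores(seq)
    hits = []
    if len(scores) < window:
        return hits
    total = sum(scores[:window])
    for start in range(len(scores) - window + 1):
        if start > 0:
            # Slide the window by one base: add the new base, drop the old one.
            total += scores[start + window - 1] - scores[start - 1]
        if abs(total) / window >= threshold:
            hits.append(start)
    return hits


def merge_hits(hits, window=25):
    """Merge overlapping or adjacent windows into (start, end) regions, end exclusive."""
    regions = []
    for start in hits:
        end = start + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def pqs_per_kb(seq, window=25, threshold=1.2):
    """Number of merged PQS regions per 1000 nt of sequence."""
    if not seq:
        return 0.0
    regions = merge_hits(window_hits(seq, window, threshold), window)
    return 1000 * len(regions) / len(seq)


if __name__ == "__main__":
    # Illustrative input only: a telomere-like repeat, not data from the studies.
    example = "GGGTTAGGGTTAGGGTTAGGGTTAG"
    print(merge_hits(window_hits(example)))
    print(pqs_per_kb(example))
```

Merging overlapping windows before counting is one possible design choice; counting conventions differ between tools, so frequencies from this sketch are not directly comparable to the figures reported in the abstracts.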