Non-canonical (non-B) DNA structures, such as bent DNA, hairpins, G-quadruplexes (G4s), and Z-DNA, which form at certain sequence motifs (e.g., A-phased repeats and inverted repeats), have emerged as important regulators of cellular processes and drivers of genome evolution. Yet, they have been understudied due to their repetitive nature and potentially inaccurate sequences generated with short-read technologies. Here we comprehensively characterize such motifs in the long-read telomere-to-telomere (T2T) genomes of human, bonobo, chimpanzee, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. Non-B DNA motifs are enriched at the genomic regions added to T2T assemblies and occupy 9%-15%, 9%-11%, and 12%-38% of autosomes and chromosomes X and Y, respectively. G4s and Z-DNA are enriched at promoters and enhancers, as well as at origins of replication. Repetitive sequences harbor more non-B DNA motifs than non-repetitive sequences, especially in the short arms of acrocentric chromosomes. Most centromeres and/or their flanking regions are enriched in at least one non-B DNA motif type, consistent with a potential role of non-B structures in determining centromeres. Our results highlight the uneven distribution of predicted non-B DNA structures across ape genomes and suggest their novel functions in previously inaccessible genomic regions.
- MeSH
- DNA * chemistry genetics MeSH
- G-Quadruplexes MeSH
- Genome, Human MeSH
- Genome * MeSH
- Hominidae * genetics MeSH
- Humans MeSH
- Nucleotide Motifs MeSH
- Pan troglodytes genetics MeSH
- Repetitive Sequences, Nucleic Acid MeSH
- Telomere * genetics MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
Multiple sex chromosomes usually arise from chromosomal rearrangements which involve ancestral sex chromosomes. There is a fundamental condition to be met for their long-term fixation: the meiosis must function, leading to the stability of the emerged system, mainly concerning the segregation of the sex multivalent. Here, we sought to analyze the degree of differentiation and meiotic pairing properties in the selected fish multiple sex chromosome system present in the wolf-fish Hoplias malabaricus (HMA). This species complex encompasses seven known karyotype forms (karyomorphs), where karyomorph C (HMA-C) exhibits nascent XY sex chromosomes from which the multiple X1X2Y system evolved in karyomorph HMA-D via a Y-autosome fusion. We combined genomic and cytogenetic approaches to analyze the satellite DNA (satDNA) content in the genome of the HMA-D karyomorph and to investigate its potential contribution to X1X2Y sex chromosome differentiation. We revealed 56 satDNA monomers, the majority of which were AT-rich and had repeat units longer than 100 bp. Seven out of 18 satDNA families chosen for chromosomal mapping by fluorescence in situ hybridization (FISH) formed detectable accumulation in at least one of the three sex chromosomes (X1, X2 and neo-Y). Nine satDNA monomers showed only two hybridization signals limited to HMA-D autosomes, and the two remaining ones provided no visible FISH signals. Out of seven satDNAs located on the HMA-D sex chromosomes, five mapped also to XY chromosomes of HMA-C. We showed that after the autosome-Y fusion event, the neo-Y chromosome has not substantially accumulated or eliminated satDNA sequences except for minor changes in the centromere-proximal region. Finally, based on the obtained FISH patterns, we speculate on the possible contribution of satDNA to sex trivalent pairing and segregation.
- MeSH
- Characiformes * genetics MeSH
- Y Chromosome genetics MeSH
- In Situ Hybridization, Fluorescence * MeSH
- Karyotype MeSH
- Meiosis genetics MeSH
- Evolution, Molecular MeSH
- Sex Chromosomes * genetics MeSH
- DNA, Satellite * genetics MeSH
- Animals MeSH
- Check Tag
- Male MeSH
- Female MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
Centromeres in the legume genera Pisum and Lathyrus exhibit unique morphological characteristics, including extended primary constrictions and multiple separate domains of centromeric chromatin. These so-called metapolycentromeres resemble an intermediate form between monocentric and holocentric types, and therefore provide a great opportunity for studying the transitions between different types of centromere organizations. However, because of the exceedingly large and highly repetitive nature of metapolycentromeres, highly contiguous assemblies needed for these studies are lacking. Here, we report on the assembly and analysis of a 177.6 Mb region of pea (Pisum sativum) chromosome 6, including the 81.6 Mb centromere region (CEN6) and adjacent chromosome arms. Genes, DNA methylation profiles, and most of the repeats were uniformly distributed within the centromere, and their densities in CEN6 and chromosome arms were similar. The exception was an accumulation of satellite DNA in CEN6, where it formed multiple arrays up to 2 Mb in length. Centromeric chromatin, characterized by the presence of the CENH3 protein, was predominantly associated with arrays of three different satellite repeats; however, five other satellites present in CEN6 lacked CENH3. The presence of CENH3 chromatin was found to determine the spatial distribution of the respective satellites during the cell cycle. Finally, oligo-FISH painting experiments, performed using probes specifically designed to label the genomic regions corresponding to CEN6 in Pisum, Lathyrus, and Vicia species, revealed that metapolycentromeres evolved via the expansion of centromeric chromatin into neighboring chromosomal regions and the accumulation of novel satellite repeats. However, in some of these species, centromere evolution also involved chromosomal translocations and centromere repositioning.
- MeSH
- Centromere genetics MeSH
- Chromatin genetics MeSH
- Pisum sativum * genetics MeSH
- Humans MeSH
- Chromosomes, Human, Pair 6 * MeSH
- DNA, Satellite genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Taking advantage of evolving and improving sequencing methods, human chromosome 8 is now available as a gapless, end-to-end assembly. Thanks to advances in long-read sequencing technologies, its centromere, telomeres, duplicated gene families and repeat-rich regions are now fully sequenced. We set out to assess whether the new assembly alters our understanding of the potential impact of non-B DNA structures within this completed chromosome sequence. It has been shown that non-B secondary structures, such as G-quadruplexes, hairpins and cruciforms, have important regulatory functions and potential as therapeutic targets. Therefore, we analysed the presence of putative G-quadruplex forming sequences and inverted repeats in the current human reference genome (GRCh38) and in the new end-to-end assembly of chromosome 8. The comparison revealed that the new assembly contains significantly more inverted repeats and G-quadruplex forming sequences than the current reference sequence. This observation can be explained by the improved accuracy of the new sequencing methods, particularly in regions that contain extensive repeats of bases, which are preferred by many non-B DNA structures. These results show a significant underestimation of the prevalence of non-B DNA secondary structures in previous assembly versions of the human genome and suggest that their importance has not been fully appreciated. We anticipate that similar observations will occur as the improved sequencing technologies fill in gaps across the genomes of humans and other organisms.
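The abstract above does not state which tools or parameters were used to predict G-quadruplex forming sequences and inverted repeats. As a rough illustration only, the sketch below uses a common regular-expression heuristic for putative G4 motifs (four runs of at least three guanines separated by loops of 1-7 nucleotides) and a naive scan for perfect inverted repeats. The function names, the arm length, and the spacer limit are hypothetical choices made for this example, not the authors' settings.

```python
import re

# Heuristic for putative G4 motifs (an assumption, not the paper's method):
# four G-runs of >= 3 bases separated by loops of 1-7 nt.
G4_PLUS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
# The same motif on the reverse strand appears as C-runs on the forward strand.
G4_MINUS = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping G4 hits."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G4_PLUS), ("-", G4_MINUS)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


def find_inverted_repeats(seq, arm=6, max_spacer=4):
    """Return (start, end, spacer) for perfect inverted repeats.

    A hit is an arm of `arm` bases followed, after 0..max_spacer bases,
    by the reverse complement of that arm. Hypothetical defaults.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        if set(left) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if seq[j:j + arm] == target:
                hits.append((i, j + arm, spacer))
    return hits


if __name__ == "__main__":
    print(find_g4_motifs("TTGGGAGGGTGGGAGGGTT"))
    print(find_inverted_repeats("GAATGCTTGCATTC"))
```

Running the same two functions over two assemblies of one chromosome and comparing hit counts would mirror the kind of comparison described above, although a quadratic scan like `find_inverted_repeats` is only practical for short sequences.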
- MeSH
- G-Quadruplexes * MeSH
- Genome, Human MeSH
- Sequence Inversion * MeSH
- Humans MeSH
- Chromosomes, Human, Pair 8 * MeSH
- Sequence Analysis, DNA MeSH
- Telomere * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Satellite DNAs are present on every chromosome in the cell and are typically enriched in repetitive, heterochromatic parts of the human genome. Sex chromosomes represent a unique genomic and epigenetic context. In this review, we first report what is known about satellite DNA biology on human X and Y chromosomes, including repeat content and organization, as well as satellite variation in typical euploid individuals. Then, we review sex chromosome aneuploidies that are among the most common types of aneuploidies in the general population, and are better tolerated than autosomal aneuploidies. This is demonstrated also by the fact that aging is associated with the loss of the X, and especially the Y chromosome. In addition, supernumerary sex chromosomes enable us to study general processes in a cell, such as analyzing heterochromatin dosage (i.e. additional Barr bodies and long heterochromatin arrays on Yq) and their downstream consequences. Finally, genomic and epigenetic organization and regulation of satellite DNA could influence chromosome stability and lead to aneuploidy. In this review, we argue that the complete annotation of satellite DNA on sex chromosomes in human, and especially in centromeric regions, will aid in explaining the prevalence and the consequences of sex chromosome aneuploidies.
- MeSH
- Aneuploidy MeSH
- Centromere genetics MeSH
- Heterochromatin * genetics MeSH
- Humans MeSH
- Chromosomes, Human MeSH
- Sex Chromosomes genetics MeSH
- DNA, Satellite * genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
Analysis of histone variants and epigenetic marks is dominated by genome-wide approaches in the form of chromatin immunoprecipitation-sequencing (ChIP-seq) and related methods. Although uncontested in their value for single-copy genes, mapping the chromatin of DNA repeats is problematic for biochemical techniques that involve averaging of cell populations or analysis of clusters of tandem repeats in a single-cell analysis. Extending chromatin and DNA fibers allows us to study the epigenetics of individual repeats in their specific chromosomal context, and thus constitutes an important tool for gaining a complete understanding of the epigenetic organization of genomes. We report that using an optimized fiber extension protocol is essential in order to obtain more reproducible data and to minimize the clustering of fibers. We also demonstrate that the use of super-resolution microscopy is important for reliable evaluation of the distribution of histone modifications on individual fibers. Furthermore, we introduce a custom script for the analysis of methylation levels on DNA fibers and apply it to map the methylation of telomeres, ribosomal genes and centromeres.
Methylation systems have been conserved during the divergence of plants and animals, although they are regulated by different pathways and enzymes. However, studies on the interactions of the epigenomes among evolutionarily distant organisms are lacking. To address this, we studied the epigenetic modification and gene expression of plant chromosome fragments (~30 Mb) in a human-Arabidopsis hybrid cell line. The whole-genome bisulfite sequencing results demonstrated that recombinant Arabidopsis DNA could retain its plant CG methylation levels even without functional plant methyltransferases, indicating that plant DNA methylation states can be maintained even in a different genomic background. The differential methylation analysis showed that the Arabidopsis DNA was undermethylated in the centromeric region and repetitive elements. Several Arabidopsis genes were still expressed, but their expression patterns were not related to gene function. We concluded that the plant DNA did not maintain the original plant epigenomic landscapes and was under the control of the human genome. This study showed how two diverging genomes can coexist and provided insights into epigenetic modifications and their impact on the regulation of gene expression between plant and animal genomes.
- MeSH
- Arabidopsis genetics MeSH
- Cell Line MeSH
- Chromosomes, Plant genetics MeSH
- DNA, Plant genetics MeSH
- Epigenesis, Genetic genetics MeSH
- Epigenome genetics MeSH
- Epigenomics methods MeSH
- Genome, Plant genetics MeSH
- Hybrid Cells physiology MeSH
- Humans MeSH
- Methyltransferases genetics MeSH
- DNA Methylation genetics MeSH
- Repetitive Sequences, Nucleic Acid genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Tandem repeats are important components of eukaryotic genomes and are crucial, for example, for centromere and telomere function and chromatin modulation. In Lepidoptera, knowledge of tandem repeats is very limited despite the growing number of sequenced genomes. Here we introduce seven new satellite DNAs (satDNAs), which more than doubles the number of currently known lepidopteran satDNAs. The satDNAs were identified in the genomes of three species of Crambidae moths, namely Ostrinia nubilalis, Cydalima perspectalis, and Diatraea postlineella, using the graph-based computational pipeline RepeatExplorer. These repeats varied in their abundance and showed high variability within and between species, although some degree of conservation was noted. The satDNAs showed a scattered distribution, often on both autosomes and sex chromosomes, with the exception of both satellites in D. postlineella, in which the satDNAs were located at a single autosomal locus. Three satDNAs were abundant on the W chromosomes of O. nubilalis and C. perspectalis, thus contributing to their differentiation from the Z chromosomes. To provide background for the in situ localization of the satDNAs, we performed a detailed cytogenetic analysis of the karyotypes of all three species. This comparative analysis revealed differences in chromosome number, number and location of rDNA clusters, and molecular differentiation of sex chromosomes.
- Publication type
- Journal Article MeSH
During homologous recombination, Dbl2 protein is required for localisation of Fbh1, an F-box helicase that efficiently dismantles Rad51-DNA filaments. RNA-seq analysis of dbl2Δ transcriptome showed that the dbl2 deletion results in upregulation of more than 500 loci in Schizosaccharomyces pombe. Compared with the loci with no change in expression, the misregulated loci in dbl2Δ are closer to long terminal and long tandem repeats. Furthermore, the misregulated loci overlap with antisense transcripts, retrotransposons, meiotic genes and genes located in subtelomeric regions. A comparison of the expression profiles revealed that Dbl2 represses the same type of genes as the HIRA histone chaperone complex. Although dbl2 deletion does not alleviate centromeric or telomeric silencing, it suppresses the silencing defect at the outer centromere caused by deletion of hip1 and slm9 genes encoding subunits of the HIRA complex. Moreover, our analyses revealed that cells lacking dbl2 show a slight increase of nucleosomes at transcription start sites and increased levels of methylated histone H3 (H3K9me2) at centromeres, subtelomeres, rDNA regions and long terminal repeats. Finally, we show that other proteins involved in homologous recombination, such as Fbh1, Rad51, Mus81 and Rad54, participate in the same gene repression pathway.
- MeSH
- Centromere MeSH
- Histone Code MeSH
- Homologous Recombination * MeSH
- Nucleosomes metabolism MeSH
- Cell Cycle Proteins antagonists & inhibitors metabolism MeSH
- Gene Expression Regulation, Fungal * MeSH
- Repressor Proteins physiology MeSH
- Schizosaccharomyces pombe Proteins antagonists & inhibitors metabolism physiology MeSH
- Schizosaccharomyces genetics MeSH
- Transcription Factors antagonists & inhibitors metabolism MeSH
- Gene Silencing * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Ever since the introduction of high-throughput sequencing following the human genome project, assembling short reads into a reference of sufficient quality has posed a significant problem, as a large portion of the human genome (an estimated 50-69%) is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters that determine whether a read is multi-mapping are the read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from "telomere to telomere". Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding in the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, which hamper aligners and assemblers in both accuracy and speed. Here, I review the proposed and implemented solutions to the repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect the shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
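The dependence of multi-mapping on read length can be illustrated with a toy calculation. The sketch below is not taken from the review. It assumes error-free, forward-strand reads sampled from every position of a made-up sequence (`TOY_GENOME`, which contains two copies of a 10-bp repeat), and reports the fraction of reads whose sequence occurs at more than one position. The function name and the sequence are hypothetical.

```python
from collections import Counter

# Made-up sequence: unique flanks around two copies of the 10-bp repeat GATTACAGAT.
TOY_GENOME = "ACCGT" + "GATTACAGAT" + "TTGGC" + "GATTACAGAT" + "CAACG"


def multimapping_fraction(genome, read_length):
    """Fraction of error-free forward-strand reads that occur at > 1 position.

    One read is taken from every start position. Sequencing errors and the
    reverse strand are ignored, which is a simplifying assumption.
    """
    n_reads = len(genome) - read_length + 1
    if read_length <= 0 or n_reads <= 0:
        raise ValueError("read_length must be between 1 and len(genome)")
    counts = Counter(genome[i:i + read_length] for i in range(n_reads))
    # Every read whose sequence was seen more than once is multi-mapping.
    multi = sum(c for c in counts.values() if c > 1)
    return multi / n_reads


if __name__ == "__main__":
    for length in (5, 10, 12):
        frac = multimapping_fraction(TOY_GENOME, length)
        print(f"read length {length}: multi-mapping fraction {frac:.3f}")
```

In this toy sequence the fraction falls to zero once the read length exceeds the repeat length, because every read then includes at least one base of unique flanking sequence. Real genomes differ in that repeats can be far longer than the reads and reads contain errors.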
- MeSH
- Centromere chemistry MeSH
- Genome Size MeSH
- Genome, Human * MeSH
- Humans MeSH
- Chromosome Mapping methods MeSH
- DNA Methylation MeSH
- Microsatellite Repeats * MeSH
- Sex Chromosomes chemistry MeSH
- Telomere chemistry MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH