centromeric tandem repeat
Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
- MeSH
- centromere MeSH
- chromosomes, plant MeSH
- DNA, plant genetics MeSH
- gene frequency * MeSH
- genome, plant MeSH
- heterochromatin MeSH
- Lathyrus genetics MeSH
- evolution, molecular MeSH
- nanopores * MeSH
- retroelements * MeSH
- DNA, satellite * MeSH
- tandem repeat sequences * MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
Holocentric chromosomes lack a primary constriction, in contrast to monocentrics. They form kinetochores distributed along almost the entire poleward surface of the chromatids, to which spindle fibers attach. No centromere-specific DNA sequence has been found for any holocentric organism studied so far. It was proposed that centromeric repeats, typical for many monocentric species, could not occur in holocentrics, most likely because of differences in the centromere organization. Here we show that the holokinetic centromeres of the Cyperaceae Rhynchospora pubera are highly enriched by a centromeric histone H3 variant-interacting centromere-specific satellite family designated "Tyba" and by centromeric retrotransposons (i.e., CRRh) occurring as genome-wide interspersed arrays. Centromeric arrays vary in length from 3 to 16 kb and are intermingled with gene-coding sequences and transposable elements. We show that holocentromeres of metaphase chromosomes are composed of multiple centromeric units rather than possessing a diffuse organization, thus favoring the polycentric model. A cell-cycle-dependent shuffling of multiple centromeric units results in the formation of functional (poly)centromeres during mitosis. The genome-wide distribution of centromeric repeat arrays interspersing the euchromatin provides a previously unidentified type of centromeric chromatin organization among eukaryotes. Thus, different types of holocentromeres exist in different species, namely with and without centromeric repetitive sequences.
Centromere position may change despite conserved chromosomal collinearity. Centromere repositioning and evolutionary new centromeres (ENCs) were frequently encountered during vertebrate genome evolution but only rarely observed in plants. The largest crucifer tribe, Arabideae (∼550 species; Brassicaceae, the mustard family), diversified into several well-defined subclades in the virtual absence of chromosome number variation. Bacterial artificial chromosome-based comparative chromosome painting uncovered a constancy of genome structures among 10 analyzed genomes representing seven Arabideae subclades classified as four genera: Arabis, Aubrieta, Draba, and Pseudoturritis. Interestingly, the intra-tribal diversification was marked by a high frequency of ENCs on five of the eight homoeologous chromosomes in the crown-group genera, but not in the most ancestral Pseudoturritis genome. Of the 32 documented ENCs, at least 26 originated independently, including 4 ENCs recurrently formed at the same position in not closely related species. While chromosomal localization of ENCs does not reflect the phylogenetic position of the Arabideae subclades, centromere seeding was usually confined to long chromosome arms, transforming acrocentric chromosomes to (sub)metacentric chromosomes. Centromere repositioning is proposed as the key mechanism differentiating overall conserved homoeologous chromosomes across the crown-group Arabideae subclades. The evolutionary significance of centromere repositioning is discussed in the context of possible adaptive effects on recombination and epigenetic regulation of gene expression.
FISH is a useful method to identify individual chromosomes in a karyotype and to discover their structural changes accompanying genome evolution and speciation. DNA probes for FISH should be chromosome specific and/or exhibit specific patterns of distribution along each chromosome. Such probes are not available in many plants including meadow fescue (Festuca pratensis Huds.), an important forage grass species. In the present study, various DNA repeats identified in Illumina shotgun sequences specific to chromosome 4F of F. pratensis were used as probes for FISH to develop the molecular karyotype of meadow fescue and to reveal a long-range molecular organization of its chromosomes. Five tandem repeats produced specific patterns on individual chromosomes. Their use in combination with probes for rRNA genes enabled the establishment of the molecular karyotype of meadow fescue. Most of the mobile genetic elements were dispersed along all the chromosomes except for the DNA transposon CACTA, which was localized preferentially to telomeric and subtelomeric regions, and a putative LTR element, which was localized to (peri)centromeric regions. Cytogenetic mapping of the 5 tandem repeats in other accessions of meadow fescue showed a highly similar distribution and confirmed the versatility and robustness of these probes.
The importance of DNA structure in the regulation of basic cellular processes is an emerging field of research. Among local non-B DNA structures, inverted repeat (IR) sequences that form cruciforms and G-rich sequences that form G-quadruplexes (G4) are found in all prokaryotic and eukaryotic organisms and are targets for regulatory proteins. We analyzed IRs and G4 sequences in the genome of the most important biotechnology microorganism, S. cerevisiae. IR and G4-prone sequences are enriched in specific genomic locations and differ markedly between mitochondrial and nuclear DNA. While G4s are overrepresented in telomeres and regions surrounding tRNAs, IRs are most enriched in centromeres, rDNA, replication origins and surrounding tRNAs. Mitochondrial DNA is enriched in both IR and G4-prone sequences relative to the nuclear genome. This extensive analysis of local DNA structures adds to the emerging picture of their importance in genome maintenance, DNA replication and transcription of subsets of genes.
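To make the notion of a "G4-prone sequence" in the abstract above concrete, the following minimal Python sketch scans one strand for the commonly used canonical quadruplex motif: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This regular expression is a widely used simplification and is an assumption here, not necessarily the predictor used in the study; the function name `find_g4_motifs` is hypothetical.

```python
import re

# Canonical G4 motif: four G-runs (>= 3 Gs) separated by loops of 1-7 nt.
# Assumption: this simple pattern stands in for whatever predictor the study used.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end) half-open intervals of non-overlapping motif matches."""
    sequence = sequence.upper()
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(sequence)]


if __name__ == "__main__":
    # Toy sequence with one telomere-like G-rich stretch flanked by A-runs.
    example = "AAAGGGTTAGGGTTAGGGTTAGGGAAA"
    print(find_g4_motifs(example))
```

The sketch searches only the given strand; to cover the complementary strand one would also scan the reverse complement (or an equivalent C-run pattern).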
The oocyte (maternal) nucleolus is essential for early embryonic development and embryos originating from enucleolated oocytes arrest at the 2-cell stage. The reason for this is unclear. Surprisingly, RNA polymerase I activity in nucleolus-less mouse embryos, as manifested by pre-rRNA synthesis, and pre-rRNA processing are not affected, indicating an unusual role of the nucleolus. We report here that the maternal nucleolus is indispensable for the regulation of major and minor satellite repeats soon after fertilisation. During the first embryonic cell cycle, absence of the nucleolus causes a significant reduction in major and minor satellite DNA by 12% and 18%, respectively. The expression of satellite transcripts is also affected, being reduced by more than half. Moreover, extensive chromosome bridging of the major and minor satellite sequences was observed during the first mitosis. Finally, we show that the absence of the maternal nucleolus alters S-phase dynamics and causes abnormal deposition of the H3.3 histone chaperone DAXX in pronuclei of nucleolus-less zygotes.
- MeSH
- blastocyst cytology metabolism MeSH
- cell nucleolus metabolism MeSH
- centromere metabolism MeSH
- embryo, mammalian cytology metabolism MeSH
- transcription, genetic MeSH
- genome genetics MeSH
- heterochromatin genetics MeSH
- RNA, messenger genetics metabolism MeSH
- microsatellite repeats genetics MeSH
- minisatellite repeats genetics MeSH
- mice MeSH
- oocytes cytology metabolism MeSH
- RNA processing, post-transcriptional genetics MeSH
- RNA precursors genetics MeSH
- recombination, genetic genetics MeSH
- DNA replication genetics MeSH
- chromatin assembly and disassembly genetics MeSH
- RNA, ribosomal biosynthesis genetics MeSH
- S phase genetics MeSH
- chromosomes, mammalian metabolism MeSH
- animals MeSH
- Check Tag
- male MeSH
- mice MeSH
- female MeSH
- animals MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
Ever since the introduction of high-throughput sequencing following the human genome project, assembling short reads into a reference of sufficient quality has posed a significant problem, as a large portion of the human genome (an estimated 50-69%) is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters for whether or not a read is multi-mapping are the read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from "telomere to telomere". Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding in the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, which impair both the accuracy and the speed of aligners and assemblers. Here, I review the proposed and implemented solutions to the repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect the shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
- MeSH
- centromere chemistry MeSH
- genome size MeSH
- genome, human * MeSH
- humans MeSH
- chromosome mapping methods MeSH
- DNA methylation MeSH
- microsatellite repeats * MeSH
- sex chromosomes chemistry MeSH
- telomere chemistry MeSH
- computational biology methods MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
- review MeSH
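The review abstract above names read length and genome complexity as the two key parameters that decide whether a read is multi-mapping. A minimal Python sketch of that relationship follows, under strong simplifying assumptions: reads are error-free, exact substrings of a toy forward-strand sequence, and a read counts as multi-mapping when the same string occurs at more than one position. The function name `multimapping_fraction` and the toy sequence are hypothetical illustrations, not taken from the review.

```python
from collections import Counter


def multimapping_fraction(genome, read_len):
    """Fraction of error-free reads of length read_len that occur at
    more than one position in genome (forward strand only)."""
    if read_len <= 0 or read_len > len(genome):
        raise ValueError("read_len must be between 1 and len(genome)")
    # Enumerate every possible read start position.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is multi-mapping if its sequence is not unique in the genome.
    multi = sum(1 for read in reads if counts[read] > 1)
    return multi / len(reads)


if __name__ == "__main__":
    # Toy "genome": a short tandem array flanked by unique sequence.
    toy = "TTG" + "ACGT" * 3 + "CCA"
    for n in (4, 8, 14):
        print(n, multimapping_fraction(toy, n))
```

In this toy case, reads shorter than the tandem array often fall entirely inside it and are ambiguous, whereas reads long enough to reach the unique flanks can be placed unambiguously, which is the intuition behind long reads spanning repetitive regions.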
Knowledge of the fascinating world of DNA repeats is continuously being enriched by newly identified elements and their hypothetical or well-established biological relevance. Genomic approaches can be used for comparative studies of major repeats in any group of genomes, regardless of their size and complexity. Such studies are particularly fruitful in large genomes, and useful mainly in crop plants where they provide a rich source of molecular markers or information on indispensable genomic components (e.g., telomeres, centromeres, or ribosomal RNA genes). Surprisingly, in Allium species, a comprehensive comparative study of repeats is lacking. Here we provide such a study of two economically important species, Allium cepa (onion) and A. sativum (garlic), and their distant relative A. ursinum (wild garlic). We present an overview and classification of major repeats in these species and have paid specific attention to sequence conservation and copy numbers of major representatives in each type of repeat, including retrotransposons, rDNA, or newly identified satellite sequences. The prevailing repeats in all three studied species belonged to Ty3/gypsy elements; however, they have diverged significantly, and we did not detect them in common clusters in the comparative analysis. In fact, only a small number of clusters were shared by all three species. Such conserved repeats were, for example, 5S and 45S rDNA genes and, surprisingly, a specific and quite rare Ty1/copia lineage. Species-specific long satellites were found mainly in A. cepa and A. sativum. We also show in situ localization of selected repeats that could potentially be applicable as chromosomal markers, e.g., in interspecific breeding.
- MeSH
- Allium classification genetics MeSH
- chromosomes, plant MeSH
- genome, plant * MeSH
- genomics * methods MeSH
- in situ hybridization, fluorescence MeSH
- nucleotide motifs MeSH
- retroelements MeSH
- DNA, satellite MeSH
- tandem repeat sequences MeSH
- telomere MeSH
- computational biology methods MeSH
- Publication type
- journal article MeSH
Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification.
- MeSH
- biological evolution * MeSH
- time factors MeSH
- centromere genetics MeSH
- chromosomes, plant genetics MeSH
- species specificity MeSH
- genetic variation MeSH
- genome, plant genetics physiology MeSH
- Magnoliopsida genetics physiology MeSH
- molecular sequence data MeSH
- base sequence MeSH
- telomere genetics MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH
BACKGROUND: Genome size evolution is a complex process influenced by polyploidization, satellite DNA accumulation, and expansion of retroelements. How this process could be affected by different reproductive strategies is still poorly understood. METHODOLOGY/PRINCIPAL FINDINGS: We analyzed differences in the number and distribution of major repetitive DNA elements in two closely related species, Silene latifolia and S. vulgaris. Both species are diploid and possess the same chromosome number (2n = 24), but differ in their genome size and mode of reproduction. The dioecious S. latifolia (1C = 2.70 pg DNA) possesses sex chromosomes and its genome is 2.5× larger than that of the gynodioecious S. vulgaris (1C = 1.13 pg DNA), which does not possess sex chromosomes. We discovered that the genome of S. latifolia is larger mainly due to the expansion of Ogre retrotransposons. Surprisingly, the centromeric STAR-C and TR1 tandem repeats were found to be more abundant in S. vulgaris, the species with the smaller genome. We further examined the distribution of major repetitive sequences in related species in the Caryophyllaceae family. The results of FISH (fluorescence in situ hybridization) on mitotic chromosomes with the Retand element indicate that large rearrangements occurred during the evolution of the Caryophyllaceae family. CONCLUSIONS/SIGNIFICANCE: Our data demonstrate that the evolution of genome size in the genus Silene is accompanied by the expansion of different repetitive elements with specific patterns in the dioecious species possessing the sex chromosomes.
- MeSH
- chromosomes, plant MeSH
- genome size MeSH
- genetic variation MeSH
- genome, plant MeSH
- genomics MeSH
- in situ hybridization, fluorescence MeSH
- nucleic acid hybridization MeSH
- Magnoliopsida genetics MeSH
- microsatellite repeats genetics MeSH
- models, genetic MeSH
- evolution, molecular MeSH
- polyploidy MeSH
- repetitive sequences, nucleic acid genetics MeSH
- genes, plant MeSH
- plant proteins genetics MeSH
- DNA, satellite genetics MeSH
- Silene classification genetics MeSH
- computational biology methods MeSH
- Publication type
- journal article MeSH
- grant-supported research MeSH