Most cited article - PubMed ID 14579131
Sequence subfamilies of satellite repeats related to rDNA intergenic spacer are differentially amplified on Vicia sativa chromosomes
Genes for major ribosomal RNAs (rDNA) are present in multiple copies mainly organized in tandem arrays. The number and position of rDNA loci can change dynamically and their repatterning is presumably driven by other repetitive sequences. We explored a peculiar rDNA organization in several representatives of Lepidoptera with either extremely large or numerous rDNA clusters. We combined molecular cytogenetics with analyses of second- and third-generation sequencing data to show that rDNA spreads as a transcription unit and reveal association between rDNA and various repeats. Furthermore, we performed comparative long read analyses among the species with derived rDNA distribution and moths with a single rDNA locus, which is considered ancestral. Our results suggest that satellite arrays, rather than mobile elements, facilitate homology-mediated spread of rDNA via either integration of extrachromosomal rDNA circles or ectopic recombination. The latter arguably better explains preferential spread of rDNA into terminal regions of lepidopteran chromosomes as efficiency of ectopic recombination depends on the proximity of homologous sequences to telomeres.
- Keywords
- Lepidoptera, major ribosomal RNA genes, mobile elements, satellite
- MeSH
- Chromosomes MeSH
- Moths * genetics MeSH
- Repetitive Sequences, Nucleic Acid * MeSH
- DNA, Ribosomal genetics MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Ribosomal MeSH
The history of rDNA research started almost 90 years ago when the geneticist Barbara McClintock observed that in interphase nuclei of maize the nucleolus was formed in association with a specific region normally located near the end of a chromosome, which she called the nucleolar organizer region (NOR). Using both light and electron microscopy, cytologists in the twentieth century recognized the nucleolus as a common structure in all eukaryotic cells, and biochemical and genetic studies identified ribosomes as the subcellular sites of protein synthesis. In the mid- to late 1960s, the synthesis of nuclear-encoded rRNA was the only system in multicellular organisms where transcripts of known function could be isolated, and their synthesis and processing could be studied. Cytogenetic observations of NOR regions with altered structure in plant interspecific hybrids and detailed knowledge of structure and function of rDNA were prerequisites for studies of nucleolar dominance, epistatic interactions of rDNA loci, and epigenetic silencing. In this article, we focus on the early rDNA research in plants, performed mainly at the dawn of molecular biology, from the 1960s to the 1980s, which served as a prequel to the modern genomic era. We discuss, from a personal view, topics such as the synthesis of the rRNA precursor (35S pre-rRNA in plants), its processing, and the organization of 35S and 5S rDNA. Cloning and sequencing led to the observation that the transcribed and processed regions of the rRNA genes vary enormously, even between populations and species, in comparison with the more conserved regions coding for the mature rRNAs. Epigenetic phenomena and the impact of hybridization and allopolyploidy on rDNA expression and homogenization are discussed.
This historical view of scientific progress and achievements sets the scene for the other articles highlighting the immense progress in rDNA research published in this special issue of Frontiers in Plant Science on "Molecular organization, evolution, and function of ribosomal DNA."
- Keywords
- epigenetics, hybridization, molecular evolution, nucleolar dominance, polyploidy, rDNA research history, rRNA precursor, rRNA processing
- Publication type
- Journal Article MeSH
- Review MeSH
Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
- Keywords
- Lathyrus sativus, centromeres, fluorescence in situ hybridization (FISH), heterochromatin, long-range organization, nanopore sequencing, satellite DNA, sequence evolution, technical advance
- MeSH
- Centromere MeSH
- Chromosomes, Plant MeSH
- DNA, Plant genetics MeSH
- Gene Frequency * MeSH
- Genome, Plant MeSH
- Heterochromatin MeSH
- Lathyrus genetics MeSH
- Evolution, Molecular MeSH
- Nanopores * MeSH
- Retroelements * MeSH
- DNA, Satellite * MeSH
- Tandem Repeat Sequences * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- Heterochromatin MeSH
- Retroelements * MeSH
- DNA, Satellite * MeSH
Satellite DNA, a class of repetitive sequences forming long arrays of tandemly repeated units, represents substantial portions of many plant genomes yet remains poorly characterized due to various methodological obstacles. Here we show that the genome of the field bean (Vicia faba, 2n = 12), a long-established model for cytogenetic studies in plants, contains a diverse set of satellite repeats, most of which remained concealed until their present investigation. Using next-generation sequencing combined with novel bioinformatics tools, we reconstructed consensus sequences of 23 novel satellite repeats representing 0.008-2.700% of the genome and mapped their distribution on chromosomes. We found that in addition to typical satellites with monomers hundreds of nucleotides long, V. faba contains a large number of satellite repeats with unusually long monomers (687-2033 bp), which are predominantly localized in pericentromeric regions. Using chromatin immunoprecipitation with CenH3 antibody, we revealed an extraordinary diversity of centromeric satellites, consisting of seven repeats with chromosome-specific distribution. We also found that in spite of their different nucleotide sequences, all centromeric repeats are replicated during mid-S phase, while most other satellites are replicated in the first part of late S phase, followed by a single family of FokI repeats representing the latest replicating chromatin.
- MeSH
- Molecular Sequence Annotation MeSH
- Centromere metabolism MeSH
- Chromatin Immunoprecipitation MeSH
- DNA, Plant genetics metabolism MeSH
- Genome, Plant genetics MeSH
- Chromosome Mapping methods MeSH
- Evolution, Molecular MeSH
- DNA Replication Timing genetics MeSH
- DNA, Satellite genetics MeSH
- Sequence Analysis, DNA MeSH
- Vicia faba genetics metabolism MeSH
- Computational Biology MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Satellite MeSH
Satellite DNA is one of the major classes of repetitive DNA, characterized by tandemly arranged repeat copies that form contiguous arrays up to megabases in length. This type of genomic organization makes satellite DNA difficult to assemble, which hampers characterization of satellite sequences by computational analysis of genomic contigs. Here, we present tandem repeat analyzer (TAREAN), a novel computational pipeline that circumvents this problem by detecting satellite repeats directly from unassembled short reads. The pipeline first employs graph-based sequence clustering to identify groups of reads that represent repetitive elements. Putative satellite repeats are subsequently detected by the presence of circular structures in their cluster graphs. Consensus sequences of repeat monomers are then reconstructed from the most frequent k-mers obtained by decomposing read sequences from corresponding clusters. The pipeline performance was successfully validated by analyzing low-pass genome sequencing data from five plant species where satellite DNA was previously experimentally characterized. Moreover, novel satellite repeats were predicted for the genome of Vicia faba and three of these repeats were verified by detecting their sequences on metaphase chromosomes using fluorescence in situ hybridization.
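As a rough illustration of the k-mer decomposition step mentioned in the abstract above, the following Python sketch counts k-mers across a set of reads and lists the most frequent ones. It is not TAREAN's implementation: the function name `top_kmers`, the toy reads, and the value of k are assumptions chosen only for illustration, and the sketch omits the clustering and consensus-assembly stages entirely.

```python
from collections import Counter


def top_kmers(reads, k, n):
    """Count every k-mer in the reads and return the n most frequent.

    Hypothetical helper for illustration; not part of TAREAN.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k along the read.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts.most_common(n)


# Toy reads (hypothetical): a 4-bp monomer "ACGT" repeated in tandem,
# sampled at two different starting phases.
reads = ["ACGTACGTACGT", "GTACGTACGTAC"]
print(top_kmers(reads, k=4, n=4))
```

In a tandem array the most frequent k-mers are the rotations of the monomer, which is why frequent k-mers can, in principle, seed a consensus reconstruction.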
- MeSH
- DNA, Plant genetics MeSH
- Genome, Plant * MeSH
- Pisum sativum genetics MeSH
- In Situ Hybridization, Fluorescence MeSH
- Consensus Sequence MeSH
- Zea mays genetics MeSH
- Magnoliopsida genetics MeSH
- Chromosome Mapping methods MeSH
- Metaphase MeSH
- Computer Graphics MeSH
- Cyperaceae genetics MeSH
- DNA, Satellite classification genetics MeSH
- Base Sequence MeSH
- Sequence Analysis, DNA MeSH
- Cluster Analysis MeSH
- Software * MeSH
- Vicia faba genetics MeSH
- Publication type
- Journal Article MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Satellite MeSH
The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.
- MeSH
- Genome Size * MeSH
- Fabaceae classification genetics MeSH
- Phylogeny MeSH
- Genetic Variation * MeSH
- Genome, Plant * MeSH
- Genomics * methods MeSH
- Terminal Repeat Sequences MeSH
- Evolution, Molecular MeSH
- Repetitive Sequences, Nucleic Acid * MeSH
- Reproducibility of Results MeSH
- Sequence Analysis, DNA MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Tandemly arranged nuclear ribosomal DNA (rDNA), encoding 18S, 5.8S and 26S ribosomal RNA (rRNA), exhibit concerted evolution, a pattern thought to result from the homogenisation of rDNA arrays. However, rDNA homogeneity at the single nucleotide polymorphism (SNP) level has not been detailed in organisms with more than a few hundred copies of the rDNA unit. Here we study rDNA complexity in species with arrays consisting of thousands of units. METHODS: We examined homogeneity of genic (18S) and non-coding internal transcribed spacer (ITS1) regions of rDNA using Roche 454 and/or Illumina platforms in four angiosperm species, Nicotiana sylvestris, N. tomentosiformis, N. otophora and N. kawakamii. We compared the data with Southern blot hybridisation revealing the structure of intergenic spacer (IGS) sequences and with the number and distribution of rDNA loci. RESULTS AND CONCLUSIONS: In all four species the intragenomic homogeneity of the 18S gene was high; a single ribotype makes up over 90% of the genes. However, greater variation was observed in the ITS1 region, particularly in species with two or more rDNA loci, where >55% of rDNA units were a single ribotype, with the second most abundant variant accounting for >18% of units. IGS heterogeneity was high in all species. The increased number of ribotypes in ITS1 compared with 18S sequences may reflect rounds of incomplete homogenisation with strong selection for functional genic regions and relaxed selection on ITS1 variants. The relationship between the number of ITS1 ribotypes and the number of rDNA loci leads us to propose that rDNA evolution and complexity are influenced by locus number and/or amplification of orphaned rDNA units at new chromosomal locations.
- MeSH
- Diploidy * MeSH
- DNA, Plant genetics MeSH
- Genetic Variation genetics MeSH
- Genetic Loci genetics MeSH
- Gene Dosage genetics MeSH
- DNA, Ribosomal Spacer genetics MeSH
- DNA, Ribosomal genetics MeSH
- Genes, Plant genetics MeSH
- Sequence Analysis, DNA MeSH
- Blotting, Southern MeSH
- Tobacco genetics MeSH
- High-Throughput Nucleotide Sequencing * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Ribosomal Spacer MeSH
- DNA, Ribosomal MeSH
BACKGROUND: Silene latifolia is a dioecious [corrected] plant with well distinguished X and Y chromosomes that is used as a model to study sex determination and sex chromosome evolution in plants. However, efficient utilization of this species has been hampered by the lack of large-scale sequencing resources and detailed analysis of its genome composition, especially with respect to repetitive DNA, which makes up the majority of the genome. METHODOLOGY/PRINCIPAL FINDINGS: We performed low-pass 454 sequencing followed by similarity-based clustering of 454 reads in order to identify and characterize sequences of all major groups of S. latifolia repeats. Illumina sequencing data from male and female genomes were also generated and employed to quantify the genomic proportions of individual repeat families. The majority of identified repeats belonged to LTR-retrotransposons, constituting about 50% of genomic DNA, with Ty3/gypsy elements being more frequent than Ty1/copia. While there were differences between the male and female genome in the abundance of several repeat families, their overall repeat composition was highly similar. Specific localization patterns on sex chromosomes were found for several satellite repeats using in situ hybridization with probes based on k-mer frequency analysis of Illumina sequencing data. CONCLUSIONS/SIGNIFICANCE: This study provides comprehensive information about the sequence composition and abundance of repeats representing over 60% of the S. latifolia genome. The results revealed generally low divergence in repeat composition between the sex chromosomes, which is consistent with their relatively recent origin. In addition, the study generated various data resources that are available for future exploration of the S. latifolia genome.
Satellite sequences of the VicTR-B family are specific for the genus Vicia (Leguminosae), but their abundance varies among the species, being the highest in Vicia sativa and Vicia grandiflora. In this study, we have sequenced multiple randomly cloned VicTR-B fragments from these two species and analyzed their sequence variability, periodicity, and chromosomal localization. We have found that V. sativa VicTR-B sequences are homogeneous with respect to their nucleotide sequences and periodicity (monomers of 38 bp), whereas V. grandiflora repeats are considerably more variable, occurring in at least four distinct sequence subfamilies. Although the periodicity of 38 bp was conserved in most of the V. grandiflora sequences, one of the subfamilies was composed of higher-order repeats of 186 bp, which originated from a pentamer of the basic repeated unit. Individual VicTR-B subfamilies were preferentially located in either intercalary or subtelomeric regions of chromosomes. Interestingly, two V. grandiflora subfamilies with the highest similarity to V. sativa VicTR-B sequences were located in intercalary heterochromatic bands, showing similar chromosomal distribution as the majority of VicTR-B repeats in V. sativa. The other two V. grandiflora subfamilies showing a considerable divergence from V. sativa sequences were found to be accumulated at subtelomeric regions of V. grandiflora chromosomes.
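The notion of repeat periodicity used in the abstract above (for example, monomers of 38 bp) can be illustrated with a small self-comparison sketch in Python. This is not the authors' method, which relied on cloning and sequencing VicTR-B fragments; the function names `self_identity` and `best_period` and the toy sequence are hypothetical, and the approach assumes an ungapped array without insertions or deletions.

```python
def self_identity(seq, period):
    """Fraction of positions where seq matches itself shifted by `period`."""
    pairs = len(seq) - period
    if pairs <= 0:
        return 0.0
    matches = sum(1 for i in range(pairs) if seq[i] == seq[i + period])
    return matches / pairs


def best_period(seq, max_period):
    """Return the shortest shift in 1..max_period with the highest self-identity.

    Ties are broken in favour of the smaller shift, so a multiple of the
    true monomer length does not win over the monomer length itself.
    """
    return max(range(1, max_period + 1),
               key=lambda p: (self_identity(seq, p), -p))


# Toy array (hypothetical): six tandem copies of a 5-bp monomer.
array = "ACGGT" * 6
print(best_period(array, max_period=12))
```

On real, diverged arrays the identity at the true period falls below 1.0, and a higher-order repeat would appear as a second, stronger peak at a multiple of the basic period.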
- MeSH
- Chromosomes, Plant chemistry MeSH
- DNA, Plant analysis MeSH
- Genetic Variation MeSH
- In Situ Hybridization, Fluorescence MeSH
- Conserved Sequence * MeSH
- Chromosome Mapping * MeSH
- Molecular Sequence Data MeSH
- DNA, Satellite analysis MeSH
- Base Sequence MeSH
- Vicia genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Satellite MeSH
In this paper we describe a pair of novel Ty3/gypsy retrotransposons isolated from the dioecious plant Silene latifolia, consisting of a non-autonomous element Retand-1 (3.7 kb) and its autonomous partner Retand-2 (11.1 kb). These two elements have highly similar long terminal repeat (LTR) sequences but differ in the presence of the typical retroelement coding regions (gag-pol genes), most of which are missing in Retand-1. Moreover, Retand-2 contains two additional open reading frames in antisense orientation localized between the pol gene and the right LTR. Retand transcripts were detected in all organs tested (leaves, flower buds and roots) which, together with the high sequence similarity of LTRs in individual elements, indicates their recent transpositional activity. The autonomous elements (2,700 copies) are about as abundant as the non-autonomous ones (2,100 copies) in the S. latifolia genome. Retand elements are also present in other Silene species, mostly in subtelomeric heterochromatin regions of all chromosomes. The only exception is the subtelomere of the short arm of the Y chromosome in S. latifolia, which is known to lack the terminal heterochromatin. An interesting feature of the Retand elements is the presence of a tandem repeat sequence, which is more amplified in the non-autonomous Retand-1.
- MeSH
- Chromosomes, Plant genetics MeSH
- DNA, Plant metabolism MeSH
- Transcription, Genetic MeSH
- Genome, Plant genetics MeSH
- Terminal Repeat Sequences genetics MeSH
- Molecular Sequence Data MeSH
- Recombinant Proteins genetics MeSH
- Retroelements genetics MeSH
- Base Sequence MeSH
- Silene genetics MeSH
- Blotting, Southern MeSH
- Tandem Repeat Sequences genetics MeSH
- Telomere genetics MeSH
- Transcription Factors genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- mdg4 protein (gypsy) MeSH Browser
- Recombinant Proteins MeSH
- Retroelements MeSH
- Transcription Factors MeSH