Most cited article - PubMed ID 22096552
Next generation sequencing-based analysis of repetitive DNA in the model dioecious [corrected] plant Silene latifolia
Telomeres are essential structures formed from satellite DNA repeats at the ends of chromosomes in most eukaryotes. Satellite DNA repeat sequences are useful markers for karyotyping, but have a more enigmatic role in the eukaryotic cell. Much work has been done to investigate the structure and arrangement of repetitive DNA elements in classical models with implications for species evolution. Still more is needed until there is a complete picture of the biological function of DNA satellite sequences, particularly when considering non-model organisms. Celebrating Gregor Mendel's anniversary by going to the roots, this review is designed to inspire and aid new research into telomeres and satellites with a particular focus on non-model organisms and accessible experimental and in silico methods that do not require specialized equipment or expensive materials. We describe how to identify telomere (and satellite) repeats giving many examples of published (and some unpublished) data from these techniques to illustrate the principles behind the experiments. We also present advice on how to perform and analyse such experiments, including details of common pitfalls. Our examples are a selection of recent developments and underexplored areas of research from the past. As a nod to Mendel's early work, we use many examples from plants and insects, especially as much recent work has expanded beyond the human and yeast models traditional in telomere research. We give a general introduction to the accepted knowledge of telomere and satellite systems and include references to specialized reviews for the interested reader.
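As a rough illustration of the kind of accessible in silico screening the review refers to, the sketch below flags sequencing reads that carry perfect tandem runs of a candidate telomere motif on either strand. The default motif `TTTAGGG` (the Arabidopsis-type plant repeat), the `min_copies` threshold, and the function names are assumptions chosen for the example and are not taken from the review. Because only perfect runs are matched, degenerate repeats will be missed.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def longest_tandem_run(seq: str, motif: str) -> int:
    """Number of motif copies in the longest perfect tandem run on either strand."""
    seq = seq.upper()
    best = 0
    for m in (motif.upper(), reverse_complement(motif)):
        # (?:motif)+ matches one or more back-to-back copies of the motif
        for hit in re.finditer(f"(?:{m})+", seq):
            best = max(best, len(hit.group()) // len(m))
    return best

def telomere_like_reads(reads, motif="TTTAGGG", min_copies=3):
    """Keep reads whose longest run has at least `min_copies` motif copies."""
    return [r for r in reads if longest_tandem_run(r, motif) >= min_copies]
```

Calling `telomere_like_reads(reads, motif="TTAGGG")` would screen for the human-type repeat instead. Candidate reads found this way still need experimental confirmation, for example by FISH.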
- Keywords
- FISH, NGS, TRAP, eukaryotic tree of life, interstitial telomere sequences, retroelements, satellite, subtelomere structure, telomerase RNA, telomere evolution
- MeSH
- DNA MeSH
- humans MeSH
- repetitive nucleic acid sequences MeSH
- satellite DNA * MeSH
- nucleotide sequence MeSH
- telomeres * genetics MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
- reviews MeSH
- Substance names
- DNA MeSH
- satellite DNA * MeSH
Trifolium L. is an economically important genus that is characterized by variable karyotypes relating to its ploidy level and basic chromosome numbers. The advent of genomic resources combined with molecular cytogenetics provides an opportunity to develop our understanding of plant genomes in general. Here, we summarize the current state of knowledge on Trifolium genomes and chromosomes and review methodologies using molecular markers that have contributed to Trifolium research. We discuss possible future applications of cytogenetic methods in research on the Trifolium genome and chromosomes.
- Keywords
- chromosomal markers, clover, cytogenetics, genome size, interspecific hybridization, polyploidy, synteny
- Publication type
- journal articles MeSH
- reviews MeSH
Plant genomes are highly diverse in size and repetitive DNA composition. In the absence of polyploidy, the dynamics of repetitive elements, which make up the bulk of the genome in many species, are the main drivers underpinning changes in genome size and the overall evolution of the genomic landscape. The advent of high-throughput sequencing technologies has enabled investigation of genome evolutionary dynamics beyond model plants to provide exciting new insights in species across the biodiversity of life. Here we analyze the evolution of repetitive DNA in two closely related species of Heloniopsis (Melanthiaceae), which despite having the same chromosome number differ nearly twofold in genome size [i.e., H. umbellata (1C = 4,680 Mb), and H. koreana (1C = 2,480 Mb)]. Low-coverage genome skimming and the RepeatExplorer2 pipeline were used to identify the main repeat families responsible for the significant differences in genome sizes. Patterns of repeat evolution were found to correlate with genome size with the main classes of transposable elements identified being twice as abundant in the larger genome of H. umbellata compared with H. koreana. In addition, among the satellite DNA families recovered, a single shared satellite (HeloSAT) was shown to have contributed significantly to the genome expansion of H. umbellata. Evolutionary changes in repetitive DNA composition and genome size indicate that the differences in genome size between these species have been underpinned by the activity of several distinct repeat lineages.
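The relationship between repeat abundance and genome size can be made concrete with a little arithmetic. The sketch below uses the two 1C values quoted in the abstract. The function names and any genome fractions passed to them are hypothetical, since the abstract does not report per-family proportions.

```python
# 1C genome sizes quoted in the abstract (Mb)
H_UMBELLATA_1C_MB = 4680
H_KOREANA_1C_MB = 2480

def repeat_mb(genome_mb: float, genome_fraction: float) -> float:
    """Absolute amount of a repeat (Mb) given its fraction of the genome."""
    return genome_mb * genome_fraction

def share_of_size_difference(frac_large: float, frac_small: float,
                             large_mb: float = H_UMBELLATA_1C_MB,
                             small_mb: float = H_KOREANA_1C_MB) -> float:
    """Fraction of the genome-size difference attributable to one repeat family.

    frac_large / frac_small are the (hypothetical) genome fractions of the
    family in the larger and the smaller genome, respectively.
    """
    gained = repeat_mb(large_mb, frac_large) - repeat_mb(small_mb, frac_small)
    return gained / (large_mb - small_mb)

size_ratio = H_UMBELLATA_1C_MB / H_KOREANA_1C_MB  # about 1.89, "nearly twofold"
```

A family that occupies the same fraction of both genomes accounts for exactly that fraction of the size difference, which is why roughly doubled absolute abundance of the main transposable element classes fits a near-twofold change in 1C value.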
- Keywords
- C-value, DNA repeats, chromosome, satellite DNA, transposable elements
- Publication type
- journal articles MeSH
Molecular evolution of ribosomal DNA can be highly dynamic. Hundreds to thousands of copies in the genome are subject to concerted evolution, which homogenizes sequence variants to different degrees. If well homogenized, sequences are suitable for phylogeny reconstruction; if not, sequence polymorphism has to be handled appropriately. Here we investigate non-coding rDNA sequences (ITS/ETS, 5S-NTS) along with the chromosomal organization of their respective loci (45S and 5S rDNA) in diploids of the Hieraciinae. The subtribe consists of genera Hieracium, Pilosella, Andryala, and Hispidella and has a complex evolutionary history characterized by ancient intergeneric hybridization, allele sharing among species, and incomplete lineage sorting. Direct or cloned Sanger sequences and phased alleles derived from Illumina genome sequencing were subjected to phylogenetic analyses. Patterns of homogenization and tree topologies based on the three regions were compared. In contrast to most other plant groups, 5S-NTS sequences were generally better homogenized than ITS and ETS sequences. A novel case of ancient intergeneric hybridization between Hispidella and Hieracium was inferred, and some further incongruences between the trees were found, suggesting independent evolution of these regions. In some species, homogenization of ITS/ETS and 5S-NTS sequences proceeded in different directions although the 5S rDNA locus always occurred on the same chromosome with one 45S rDNA locus. The ancestral rDNA organization in the Hieraciinae comprised 4 loci of 45S rDNA in terminal positions and 2 loci of 5S rDNA in interstitial positions per diploid genome. In Hieracium, some deviations from this general pattern were found (3, 6, or 7 loci of 45S rDNA; three loci of 5S rDNA). Some of these deviations concerned intraspecific variation, and most of them occurred at the tips of the tree or independently in different lineages. 
This indicates that the organization of rDNA loci is more dynamic than the evolution of sequences contained in them and that locus number is therefore largely unsuitable to inform about species relationships in Hieracium. No consistent differences in the degree of sequence homogenization and the number of 45S rDNA loci were found, suggesting interlocus concerted evolution.
- Keywords
- 45S rDNA, 5S rDNA, Andryala, Hieracium, Pilosella, concerted evolution, in situ hybridization, molecular phylogeny
- Publication type
- journal articles MeSH
The repetitive content of the plant genome (repeatome) often represents its largest fraction and is frequently correlated with its size. Transposable elements (TEs), the main component of the repeatome, are an important driver of genome diversification due to their fast-evolving nature. Hybridization and polyploidization events are hypothesized to induce massive bursts of TEs resulting, among other effects, in an increase of copy number and genome size. Little is known about the repeatome dynamics following hybridization and polyploidization in plants that reproduce by apomixis (asexual reproduction via seeds). To address this, we analyzed the repeatomes of two diploid parental species, Hieracium intybaceum and H. prenanthoides (sexual), their synthetic diploid F1 hybrids, and their natural triploid hybrids (H. pallidiflorum and H. picroides, apomictic). Using low-coverage next-generation sequencing (NGS) and a graph-based clustering approach, we detected high overall similarity across all major repeatome categories between the parental species, despite their large phylogenetic distance. Medium and highly abundant repetitive elements comprise ∼70% of Hieracium genomes; most prevalent were Ty3/Gypsy chromovirus Tekay and Ty1/Copia Maximus-SIRE elements. No TE bursts were detected in either synthetic or natural hybrids, as TE abundance generally followed theoretical expectations based on parental genome dosage. Slight over- and under-representation of TE cluster abundances reflected individual differences in genome size. However, in comparative analyses, apomicts displayed an overabundance of pararetrovirus clusters not observed in synthetic hybrids. Substantial deviations were detected in rDNAs and satellite repeats, but these patterns were sample specific. rDNA and satellite repeats (three of which were newly developed as cytogenetic markers) were localized on chromosomes by fluorescence in situ hybridization (FISH).
In a few cases, low-abundant repeats (5S rDNA and certain satellites) showed some discrepancy between NGS data and FISH results, which is due partly to the bias of low-coverage sequencing and partly to low amounts of the satellite repeats or their sequence divergence. Overall, satellite DNA (including rDNA) was markedly affected by hybridization, but independent of the ploidy or reproductive mode of the progeny, whereas bursts of TEs did not play an important role in the evolutionary history of Hieracium.
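The comparison against "theoretical expectations based on parental genome dosage" can be sketched as a weighted average of the parental repeat fractions. The formula below is an assumed, simplified model (each parental genome contributes in proportion to its size times the number of copies inherited) and is not taken from the study. All names and input values are hypothetical.

```python
def expected_hybrid_fraction(parent_fractions, parent_genome_mb, dosage):
    """Expected genome fraction of a repeat in a hybrid.

    parent_fractions: genome fraction of the repeat in each parent
    parent_genome_mb: monoploid genome size of each parent (Mb)
    dosage: number of genome copies each parent contributes to the hybrid
    """
    total_repeat = sum(f * s * d for f, s, d in
                       zip(parent_fractions, parent_genome_mb, dosage))
    total_genome = sum(s * d for s, d in zip(parent_genome_mb, dosage))
    return total_repeat / total_genome

def observed_to_expected(observed_fraction, expected_fraction):
    """Ratio above 1 suggests over-representation, below 1 under-representation."""
    return observed_fraction / expected_fraction
```

For a triploid carrying two genome copies from one parent and one from the other, `dosage=[2, 1]` shifts the expectation towards the parent that contributed more. A TE burst would show up as an observed-to-expected ratio well above 1.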
- Keywords
- RepeatExplorer, apomixis, hawkweed, hybridization, next-generation sequencing, polyploidization, repeatome
- Publication type
- journal articles MeSH
BACKGROUND: Cultivated grasses are an important source of food for domestic animals worldwide. Increased knowledge of their genomes can speed up the development of new cultivars with better quality and greater resistance to biotic and abiotic stresses. The most widely grown grasses are tetraploid ryegrass species (Lolium) and diploid and hexaploid fescue species (Festuca). In this work, we characterized repetitive DNA sequences and their contribution to genome size in five fescue and two ryegrass species as well as one fescue and two ryegrass cultivars. RESULTS: Partial genome sequences produced by Illumina sequencing technology were used for genome-wide comparative analyses with the RepeatExplorer pipeline. Retrotransposons were the most abundant repeat type in all seven grass species. The Athila element of the Ty3/gypsy family showed the most striking differences in copy number between fescues and ryegrasses. The sequence data enabled the assembly of the long terminal repeat (LTR) element Fesreba, which is highly enriched in centromeric and (peri)centromeric regions in all species. A combination of fluorescence in situ hybridization (FISH) with a probe specific to the Fesreba element and immunostaining with centromeric histone H3 (CENH3) antibody showed their co-localization and indicated a possible role of Fesreba in centromere function. CONCLUSIONS: Comparative repeatome analyses in a set of fescues and ryegrasses provided new insights into their genome organization and divergence, including the assembly of the LTR element Fesreba. A new LTR element Fesreba was identified and found in abundance in centromeric regions of the fescues and ryegrasses. It may play a role in the function of their centromeres.
- Keywords
- Centromere organization, Festuca, Illumina sequencing, Lolium, Repetitive DNA
- MeSH
- centromere genetics MeSH
- plant chromosomes * MeSH
- Festuca genetics MeSH
- plant genome genetics MeSH
- Lolium genetics MeSH
- repetitive nucleic acid sequences * MeSH
- Publication type
- journal articles MeSH
The unigeneric tribe Heliophileae encompassing more than 100 Heliophila species is morphologically the most diverse Brassicaceae lineage. The tribe is endemic to southern Africa, confined chiefly to southwestern South Africa, home of two biodiversity hotspots (Cape Floristic Region and Succulent Karoo). The monospecific Chamira (C. circaeoides), the only crucifer species with persistent cotyledons, is traditionally retrieved as the closest relative of Heliophileae. Our transcriptome analysis revealed a whole-genome duplication (WGD) ∼26.15-29.20 million years ago, presumably preceding the Chamira/Heliophila split. The WGD was then followed by genome-wide diploidization, species radiations, and cladogenesis in Heliophila. The expanded phylogeny based on nuclear ribosomal DNA internal transcribed spacer (ITS) uncovered four major infrageneric clades (A-D) in Heliophila and corroborated the sister relationship between Chamira and Heliophila. Herein, we analyzed how the diploidization process impacted the evolution of repetitive sequences through low-coverage whole-genome sequencing of 15 Heliophila species, representing the four clades, and Chamira. Despite the firmly established infrageneric cladogenesis and different ecological life histories (four perennials vs. 11 annual species), repeatome analysis showed overall comparable evolution of genome sizes (288-484 Mb) and repeat content (25.04-38.90%) across Heliophila species and clades. Among Heliophila species, long terminal repeat (LTR) retrotransposons were the predominant components of the analyzed genomes (11.51-22.42%), whereas tandem repeats had lower abundances (1.03-12.10%). In Chamira, the tandem repeat content (17.92%, 16 diverse tandem repeats) equals the abundance of LTR retrotransposons (16.69%). Among the 108 tandem repeats identified in Heliophila, only 16 repeats were found to be shared among two or more species; no tandem repeats were shared by Chamira and Heliophila genomes.
Six "relic" tandem repeats were shared between any two different Heliophila clades by common descent. Four and six clade-specific repeats shared among clade A and C species, respectively, support the monophyly of these two clades. Three repeats shared by all clade A species corroborate the recent diversification of this clade revealed by plastome-based molecular dating. Phylogenetic analysis based on repeat sequence similarities separated the Heliophila species into three clades [A, C, and (B+D)], mirroring the post-polyploid cladogenesis in Heliophila inferred from rDNA ITS and plastome sequences.
- Keywords
- Cape flora, Cruciferae, South Africa, plastome phylogeny, rDNA ITS, repeatome, repetitive DNA, whole-genome duplication (WGD)
- Publication type
- journal articles MeSH
BACKGROUND: Plant LTR-retrotransposons are classified into two superfamilies, Ty1/copia and Ty3/gypsy. They are further divided into an enormous number of families which are, due to the high diversity of their nucleotide sequences, usually specific to a single species or a group of closely related species. Previous attempts to group these families into broader categories reflecting their phylogenetic relationships were limited either to analyzing a narrow range of plant species or to analyzing a small number of elements. Furthermore, there is no reference database that allows for similarity-based classification of LTR-retrotransposons. RESULTS: We have assembled a database of retrotransposon-encoded polyprotein domain sequences extracted from 5410 Ty1/copia elements and 8453 Ty3/gypsy elements sampled from 80 species representing major groups of green plants (Viridiplantae). Phylogenetic analysis of the three most conserved polyprotein domains (RT, RH and INT) led to dividing Ty1/copia and Ty3/gypsy retrotransposons into 16 and 14 lineages, respectively. We also characterized various features of LTR-retrotransposon sequences including additional polyprotein domains, extra open reading frames and primer binding sites, and found that the occurrence and/or type of these features correlates with phylogenies inferred from the three protein domains. CONCLUSIONS: We have established an improved classification system applicable to LTR-retrotransposons from a wide range of plant species. This system reflects phylogenetic relationships as well as distinct sequence and structural features of the elements. A comprehensive database of retrotransposon protein domains (REXdb) that reflects this classification provides a reference for efficient and unified annotation of LTR-retrotransposons in plant genomes.
REXdb-related tools can be accessed through the RepeatExplorer web server (https://repeatexplorer-elixir.cerit-sc.cz/) or through a standalone version of REXdb that can be downloaded separately from the RepeatExplorer web page (http://repeatexplorer.org/).
- Keywords
- LTR-retrotransposons, Polyprotein domains, Primer binding site, RepeatExplorer, Transposable elements
- Publication type
- journal articles MeSH
BACKGROUND: The rise and fall of the Y chromosome has been demonstrated in animals, but plants often possess a large, evolutionarily young Y chromosome that is thought to have expanded recently. Break-even points dividing the expansion and shrinkage phases of plant Y chromosome evolution are still to be determined. To assess the size dynamics of the Y chromosome, we studied intraspecific genome size variation and genome composition of male and female individuals in the dioecious plant Silene latifolia, a well-established model for sex chromosome evolution. RESULTS: Our genome size data are the first to demonstrate that, regardless of intraspecific genome size variation, the Y chromosome has retained its size in S. latifolia. A bioinformatic study of genome composition showed that the constancy of Y chromosome size was caused by Y chromosome DNA loss and the female-specific proliferation of recently active dominant retrotransposons. We show that several families of retrotransposons have contributed to genome size variation but not to Y chromosome size change. CONCLUSIONS: Our results suggest that the large Y chromosome of S. latifolia has slowed down or stopped its expansion. Female-specific proliferation of retrotransposons, enlarging the genome with the exception of the Y chromosome, was probably caused by silencing of highly active retrotransposons in males and represents an adaptive mechanism to suppress degenerative processes in the haploid stage. Sex-specific silencing of transposons might be widespread in plants but hidden in traditional hermaphroditic model plants.
- Keywords
- Epigenetics, Genome size, Silene latifolia, Transposable elements, Y chromosome
- MeSH
- plant chromosomes * MeSH
- genome size MeSH
- plant DNA * MeSH
- plant genome MeSH
- fluorescence in situ hybridization MeSH
- terminal repeat sequences MeSH
- chromosome mapping MeSH
- molecular evolution * MeSH
- repetitive nucleic acid sequences MeSH
- retroelements * MeSH
- sequence deletion * MeSH
- Silene classification genetics MeSH
- gene silencing * MeSH
- DNA copy number variations MeSH
- base composition MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
- Substance names
- plant DNA * MeSH
- retroelements * MeSH
Satellite DNA is one of the major classes of repetitive DNA, characterized by tandemly arranged repeat copies that form contiguous arrays up to megabases in length. This type of genomic organization makes satellite DNA difficult to assemble, which hampers characterization of satellite sequences by computational analysis of genomic contigs. Here, we present tandem repeat analyzer (TAREAN), a novel computational pipeline that circumvents this problem by detecting satellite repeats directly from unassembled short reads. The pipeline first employs graph-based sequence clustering to identify groups of reads that represent repetitive elements. Putative satellite repeats are subsequently detected by the presence of circular structures in their cluster graphs. Consensus sequences of repeat monomers are then reconstructed from the most frequent k-mers obtained by decomposing read sequences from corresponding clusters. The pipeline performance was successfully validated by analyzing low-pass genome sequencing data from five plant species where satellite DNA was previously experimentally characterized. Moreover, novel satellite repeats were predicted for the genome of Vicia faba and three of these repeats were verified by detecting their sequences on metaphase chromosomes using fluorescence in situ hybridization.
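The idea of rebuilding a monomer consensus from the most frequent k-mers can be illustrated with a toy example. The sketch below is a deliberately simplified greedy walk over k-mer counts and is not TAREAN's actual implementation. The function names, the default `k`, and the stopping rules are assumptions made for the illustration. It relies on the same circularity the abstract describes: in a tandem array, following overlapping k-mers eventually leads back to the starting k-mer.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def greedy_monomer(reads, k=5, max_len=1000):
    """Walk from the most frequent k-mer, always taking the most frequent
    overlapping successor, until the walk returns to the start.

    Returns the monomer (some rotation of it) or None if no cycle closes.
    """
    counts = count_kmers(reads, k)
    if not counts:
        return None
    start = counts.most_common(1)[0][0]
    path, seen, current = [start], {start}, start
    while len(path) <= max_len:
        # successors overlap the current k-mer by k-1 bases
        candidates = [current[1:] + base for base in "ACGT"]
        nxt = max(candidates, key=lambda km: counts.get(km, 0))
        if counts.get(nxt, 0) == 0:
            return None  # dead end: not a circular structure
        if nxt == start:
            # cycle closed: the first base of each k-mer spells the monomer
            return "".join(km[0] for km in path)
        if nxt in seen:
            return None  # loop that does not pass through the start
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return None
```

On reads sampled from a perfect tandem array the walk returns one rotation of the monomer. Real data contain sequence variants and sequencing errors, so a practical tool needs more than this greedy choice.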
- MeSH
- plant DNA genetics MeSH
- plant genome * MeSH
- Pisum sativum genetics MeSH
- fluorescence in situ hybridization MeSH
- consensus sequence MeSH
- Zea mays genetics MeSH
- Magnoliopsida genetics MeSH
- chromosome mapping methods MeSH
- metaphase MeSH
- computer graphics MeSH
- Cyperaceae genetics MeSH
- satellite DNA classification genetics MeSH
- nucleotide sequence MeSH
- DNA sequence analysis MeSH
- cluster analysis MeSH
- software * MeSH
- Vicia faba genetics MeSH
- Publication type
- journal articles MeSH
- Substance names
- plant DNA MeSH
- satellite DNA MeSH