Most cited article - PubMed ID 18031571
Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula
Grasspea (Lathyrus sativus L.) is an underutilised but promising legume crop with tolerance to a wide range of abiotic and biotic stress factors, and potential for climate-resilient agriculture. Despite a long history and wide geographical distribution of cultivation, only limited breeding resources are available. This paper reports a 5.96 Gbp genome assembly of grasspea genotype LS007, of which 5.03 Gbp is scaffolded into 7 pseudo-chromosomes. The assembly has a BUSCO completeness score of 99.1% and is annotated with 31719 gene models and repeat elements. This represents the most contiguous and accurate assembly of the grasspea genome to date.
- MeSH
- Chromosomes, Plant * genetics MeSH
- Genome, Plant * MeSH
- Lathyrus * genetics MeSH
- Publication type
- Journal Article MeSH
- Dataset MeSH
Centromeres in the legume genera Pisum and Lathyrus exhibit unique morphological characteristics, including extended primary constrictions and multiple separate domains of centromeric chromatin. These so-called metapolycentromeres resemble an intermediate form between monocentric and holocentric types, and therefore provide a great opportunity for studying the transitions between different types of centromere organizations. However, because of the exceedingly large and highly repetitive nature of metapolycentromeres, highly contiguous assemblies needed for these studies are lacking. Here, we report on the assembly and analysis of a 177.6 Mb region of pea (Pisum sativum) chromosome 6, including the 81.6 Mb centromere region (CEN6) and adjacent chromosome arms. Genes, DNA methylation profiles, and most of the repeats were uniformly distributed within the centromere, and their densities in CEN6 and chromosome arms were similar. The exception was an accumulation of satellite DNA in CEN6, where it formed multiple arrays up to 2 Mb in length. Centromeric chromatin, characterized by the presence of the CENH3 protein, was predominantly associated with arrays of three different satellite repeats; however, five other satellites present in CEN6 lacked CENH3. The presence of CENH3 chromatin was found to determine the spatial distribution of the respective satellites during the cell cycle. Finally, oligo-FISH painting experiments, performed using probes specifically designed to label the genomic regions corresponding to CEN6 in Pisum, Lathyrus, and Vicia species, revealed that metapolycentromeres evolved via the expansion of centromeric chromatin into neighboring chromosomal regions and the accumulation of novel satellite repeats. However, in some of these species, centromere evolution also involved chromosomal translocations and centromere repositioning.
- MeSH
- Centromere genetics MeSH
- Chromatin genetics MeSH
- Pisum sativum * genetics MeSH
- Humans MeSH
- Chromosomes, Human, Pair 6 * MeSH
- DNA, Satellite genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Chromatin MeSH
- DNA, Satellite MeSH
Telomeres are essential structures formed from satellite DNA repeats at the ends of chromosomes in most eukaryotes. Satellite DNA repeat sequences are useful markers for karyotyping, but have a more enigmatic role in the eukaryotic cell. Much work has been done to investigate the structure and arrangement of repetitive DNA elements in classical models with implications for species evolution. Still more is needed until there is a complete picture of the biological function of DNA satellite sequences, particularly when considering non-model organisms. Celebrating Gregor Mendel's anniversary by going to the roots, this review is designed to inspire and aid new research into telomeres and satellites with a particular focus on non-model organisms and accessible experimental and in silico methods that do not require specialized equipment or expensive materials. We describe how to identify telomere (and satellite) repeats giving many examples of published (and some unpublished) data from these techniques to illustrate the principles behind the experiments. We also present advice on how to perform and analyse such experiments, including details of common pitfalls. Our examples are a selection of recent developments and underexplored areas of research from the past. As a nod to Mendel's early work, we use many examples from plants and insects, especially as much recent work has expanded beyond the human and yeast models traditional in telomere research. We give a general introduction to the accepted knowledge of telomere and satellite systems and include references to specialized reviews for the interested reader.
- Keywords
- FISH, NGS, TRAP, eukaryotic tree of life, interstitial telomere sequences, retroelements, satellite, subtelomere structure, telomerase RNA, telomere evolution
- MeSH
- DNA MeSH
- Humans MeSH
- Repetitive Sequences, Nucleic Acid MeSH
- DNA, Satellite * MeSH
- Base Sequence MeSH
- Telomere * genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Names of Substances
- DNA MeSH
- DNA, Satellite * MeSH
Trifolium L. is an economically important genus that is characterized by variable karyotypes relating to its ploidy level and basic chromosome numbers. The advent of genomic resources combined with molecular cytogenetics provides an opportunity to develop our understanding of plant genomes in general. Here, we summarize the current state of knowledge on Trifolium genomes and chromosomes and review methodologies using molecular markers that have contributed to Trifolium research. We discuss possible future applications of cytogenetic methods in research on the Trifolium genome and chromosomes.
- Keywords
- chromosomal markers, clover, cytogenetics, genome size, interspecific hybridization, polyploidy, synteny
- Publication type
- Journal Article MeSH
- Review MeSH
Repeat-rich regions of higher plant genomes are usually associated with constitutive heterochromatin, a specific type of chromatin that forms tightly packed nuclear chromocenters and chromosome bands. There is a large body of cytogenetic evidence that these chromosome regions are often composed of tandemly organized satellite DNA. However, comparatively little is known about the sequence arrangement within heterochromatic regions, which are difficult to assemble due to their repeated nature. Here, we explore long-range sequence organization of heterochromatin regions containing the major satellite repeat CUS-TR24 in the holocentric plant Cuscuta europaea. Using a combination of ultra-long read sequencing with assembly-free sequence analysis, we reveal the complex structure of these loci, which are composed of short arrays of CUS-TR24 interrupted frequently by emerging simple sequence repeats and targeted insertions of a specific lineage of LINE retrotransposons. These data suggest that the organization of satellite repeats constituting heterochromatic chromosome bands can be more complex than previously envisioned, and demonstrate that heterochromatin organization can be efficiently investigated without the need for genome assembly.
- Keywords
- Fluorescence in situ hybridization, Heterochromatin, Holocentric chromosomes, LINE elements, Oxford Nanopore sequencing, Satellite DNA
- Publication type
- Journal Article MeSH
Satellite repeats are major sequence constituents of centromeres in many plant and animal species. Within a species, a single family of satellite sequences typically occupies centromeres of all chromosomes and is absent from other parts of the genome. Due to their common origin, sequence similarities exist among the centromere-specific satellites in related species. Here, we report a remarkably different pattern of centromere evolution in the plant tribe Fabeae, which includes genera Pisum, Lathyrus, Vicia, and Lens. By immunoprecipitation of centromeric chromatin with CENH3 antibodies, we identified and characterized a large and diverse set of 64 families of centromeric satellites in 14 species. These families differed in their nucleotide sequence, monomer length (33-2,979 bp), and abundance in individual species. Most families were species-specific, and most species possessed multiple (2-12) satellites in their centromeres. Some of the repeats that were shared by several species exhibited promiscuous patterns of centromere association, being located within CENH3 chromatin in some species, but apart from the centromeres in others. Moreover, FISH experiments revealed that the same family could assume centromeric and noncentromeric positions even within a single species. Taken together, these findings suggest that Fabeae centromeres are not shaped by the coevolution of a single centromeric satellite with its interacting CENH3 proteins, as proposed by the centromere drive model. This conclusion is also supported by the absence of pervasive adaptive evolution of CENH3 sequences retrieved from Fabeae species.
- Keywords
- CENH3, ChIP-seq, centromere evolution, plant chromosomes, satellite DNA
- MeSH
- Centromere chemistry MeSH
- Species Specificity MeSH
- Fabaceae genetics MeSH
- Genetic Variation * MeSH
- DNA, Satellite chemistry MeSH
- Selection, Genetic MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Comparative Study MeSH
- Names of Substances
- DNA, Satellite MeSH
BACKGROUND: Cultivated grasses are an important source of food for domestic animals worldwide. Increased knowledge of their genomes can speed up the development of new cultivars with better quality and greater resistance to biotic and abiotic stresses. The most widely grown grasses are tetraploid ryegrass species (Lolium) and diploid and hexaploid fescue species (Festuca). In this work, we characterized repetitive DNA sequences and their contribution to genome size in five fescue and two ryegrass species as well as one fescue and two ryegrass cultivars. RESULTS: Partial genome sequences produced by Illumina sequencing technology were used for genome-wide comparative analyses with the RepeatExplorer pipeline. Retrotransposons were the most abundant repeat type in all seven grass species. The Athila element of the Ty3/gypsy family showed the most striking differences in copy number between fescues and ryegrasses. The sequence data enabled the assembly of the long terminal repeat (LTR) element Fesreba, which is highly enriched in centromeric and (peri)centromeric regions in all species. A combination of fluorescence in situ hybridization (FISH) with a probe specific to the Fesreba element and immunostaining with centromeric histone H3 (CENH3) antibody showed their co-localization and indicated a possible role of Fesreba in centromere function. CONCLUSIONS: Comparative repeatome analyses in a set of fescues and ryegrasses provided new insights into their genome organization and divergence, including the assembly of the LTR element Fesreba. A new LTR element Fesreba was identified and found in abundance in centromeric regions of the fescues and ryegrasses. It may play a role in the function of their centromeres.
Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
- Keywords
- Lathyrus sativus, centromeres, fluorescence in situ hybridization (FISH), heterochromatin, long-range organization, nanopore sequencing, satellite DNA, sequence evolution, technical advance
- MeSH
- Centromere MeSH
- Chromosomes, Plant MeSH
- DNA, Plant genetics MeSH
- Gene Frequency * MeSH
- Genome, Plant MeSH
- Heterochromatin MeSH
- Lathyrus genetics MeSH
- Evolution, Molecular MeSH
- Nanopores * MeSH
- Retroelements * MeSH
- DNA, Satellite * MeSH
- Tandem Repeat Sequences * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- DNA, Plant MeSH
- Heterochromatin MeSH
- Retroelements * MeSH
- DNA, Satellite * MeSH
The centromere is the region on a chromosome where the kinetochore assembles and spindle microtubules attach during mitosis and meiosis. In the vast majority of eukaryotes, the centromere position is determined epigenetically by the presence of the centromere-specific histone H3 variant CENH3. In species with monocentric chromosomes, CENH3 is confined to a single chromosomal region corresponding to the primary constriction on metaphase chromosomes. By contrast, in holocentrics, CENH3 (and thus centromere activity) is distributed along the entire chromosome length. Here, we report a unique pattern of CENH3 distribution in the holocentric plant Cuscuta europaea. This species expressed two major variants of CENH3, both of which were deposited into one to three discrete regions per chromosome, whereas the rest of the chromatin appeared to be devoid of CENH3. The two CENH3 variants fully co-localized, and their immunodetection signals overlapped with the positions of DAPI-positive heterochromatic bands containing the highly amplified satellite repeat CUS-TR24. This CENH3 distribution pattern contrasted with the distribution of the mitotic spindle microtubules, which attached at uniform density along the entire chromosome length. This distribution of spindle attachment sites proves the holocentric nature of C. europaea chromosomes and also suggests that, in this species, CENH3 either lost its function or acts in parallel to an additional CENH3-free mechanism of kinetochore positioning.
- Keywords
- CENH3, Cuscuta, centromere, holocentric chromosomes, kinetochore, repetitive DNA analysis, satellite DNA
- Publication type
- Journal Article MeSH
BACKGROUND: Plant LTR-retrotransposons are classified into two superfamilies, Ty1/copia and Ty3/gypsy. They are further divided into an enormous number of families which are, due to the high diversity of their nucleotide sequences, usually specific to a single species or a group of closely related species. Previous attempts to group these families into broader categories reflecting their phylogenetic relationships were limited either to analyzing a narrow range of plant species or to analyzing a small number of elements. Furthermore, there is no reference database that allows for similarity-based classification of LTR-retrotransposons. RESULTS: We have assembled a database of retrotransposon-encoded polyprotein domain sequences extracted from 5410 Ty1/copia elements and 8453 Ty3/gypsy elements sampled from 80 species representing major groups of green plants (Viridiplantae). Phylogenetic analysis of the three most conserved polyprotein domains (RT, RH and INT) led to dividing Ty1/copia and Ty3/gypsy retrotransposons into 16 and 14 lineages, respectively. We also characterized various features of LTR-retrotransposon sequences including additional polyprotein domains, extra open reading frames and primer binding sites, and found that the occurrence and/or type of these features correlates with phylogenies inferred from the three protein domains. CONCLUSIONS: We have established an improved classification system applicable to LTR-retrotransposons from a wide range of plant species. This system reflects phylogenetic relationships as well as distinct sequence and structural features of the elements. A comprehensive database of retrotransposon protein domains (REXdb) that reflects this classification provides a reference for efficient and unified annotation of LTR-retrotransposons in plant genomes.
REXdb-related tools are accessible through the RepeatExplorer web server (https://repeatexplorer-elixir.cerit-sc.cz/) or through a standalone version of REXdb that can be downloaded separately from the RepeatExplorer web page (http://repeatexplorer.org/).
- Keywords
- LTR-retrotransposons, Polyprotein domains, Primer binding site, RepeatExplorer, Transposable elements
- Publication type
- Journal Article MeSH