BACKGROUND: Angiostrongylus cantonensis (Ac), or the rat lungworm, is a major cause of eosinophilic meningitis. Humans are infected by ingesting the 3rd stage larvae from primary hosts (snails and slugs) or paratenic hosts. The currently used molecular test is a qPCR assay targeting the ITS1 rDNA region (ITS1) of Ac. METHODS: In silico design of a more sensitive qPCR assay was performed based on tandem repeats predicted to be the most abundant by the RepeatExplorer algorithm. Genomic DNA (gDNA) of Ac was used to determine the analytical sensitivity and specificity of the best primer/probe combination. This assay was then applied to clinical and environmental samples. RESULTS: The limit of detection of the best performing assay, AcanR3990, was 1 fg (the DNA equivalent of a 1/100 000 dilution of a single 3rd stage larva). Out of 127 CDC archived CSF samples from varied geographic locations, the AcanR3990 qPCR detected the presence of Ac in 49/49 ITS1 confirmed angiostrongyliasis patients, along with 15/73 samples previously negative by ITS1 qPCR despite strong clinical suspicion for angiostrongyliasis. Intermediate hosts (gastropods) and an accidental host, a symptomatic horse, were also tested with similar improvement in detection observed. AcanR3990 qPCR did not cross-react in 5 CSF samples from patients with proven neurocysticercosis, toxocariasis, gnathostomiasis, and baylisascariasis. AcanR3990 qPCR failed to amplify genomic DNA from the other related Angiostrongylus species tested except for Angiostrongylus mackerrasae (Am), a neurotropic species limited to Australia that would be expected to present with a clinical syndrome indistinguishable from Ac. CONCLUSION: These results suggest the AcanR3990 qPCR assay is highly sensitive and specific with potential wide applicability as a One Health detection method for Ac and Am.
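The figures reported in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: every input is taken from the abstract, while the function names and the idea of combining the two CSF groups into a single detection rate are assumptions made here, not part of the study's analysis.

```python
# Illustrative arithmetic only; all inputs come from the abstract above.
# Function names are hypothetical helpers introduced for this sketch.

LOD_FG = 1.0               # reported limit of detection, in femtograms
DILUTION_FACTOR = 100_000  # 1 fg is stated to equal a 1/100 000 dilution of one larva


def implied_larva_dna_fg(lod_fg: float, dilution: int) -> float:
    """DNA content of a single larva implied by the stated equivalence."""
    return lod_fg * dilution


def percent(hits: int, total: int) -> float:
    """Share of positive samples, as a percentage."""
    return 100.0 * hits / total


larva_fg = implied_larva_dna_fg(LOD_FG, DILUTION_FACTOR)
confirmed_rate = percent(49, 49)      # ITS1-confirmed patients detected by AcanR3990
extra_rate = percent(15, 73)          # ITS1-negative, clinically suspected samples
# Assumption of this sketch: pool both groups (49 + 73 = 122 samples).
pooled_acanr = percent(49 + 15, 49 + 73)
pooled_its1 = percent(49, 49 + 73)

print(f"Implied DNA per larva: {larva_fg / 1000:.0f} pg")
print(f"Detected among ITS1-confirmed: {confirmed_rate:.1f}%")
print(f"Additional positives among ITS1-negative: {extra_rate:.1f}%")
print(f"Pooled positives, AcanR3990 vs ITS1: {pooled_acanr:.1f}% vs {pooled_its1:.1f}%")
```

The pooled comparison is only a convenience for reading the two reported fractions side by side; it says nothing about clinical sensitivity, because the true infection status of the ITS1-negative samples is not given.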
Tandem repeats are important parts of eukaryotic genomes, being crucial, e.g., for centromere and telomere function and chromatin modulation. In Lepidoptera, knowledge of tandem repeats is very limited despite the growing number of sequenced genomes. Here we introduce seven new satellite DNAs (satDNAs), which more than doubles the number of currently known lepidopteran satDNAs. The satDNAs were identified in genomes of three species of Crambidae moths, namely Ostrinia nubilalis, Cydalima perspectalis, and Diatraea postlineella, using the graph-based computational pipeline RepeatExplorer. These repeats varied in their abundance and showed high variability within and between species, although some degree of conservation was noted. The satDNAs showed a scattered distribution, often on both autosomes and sex chromosomes, with the exception of both satellites in D. postlineella, in which the satDNAs were located at a single autosomal locus. Three satDNAs were abundant on the W chromosomes of O. nubilalis and C. perspectalis, thus contributing to their differentiation from the Z chromosomes. To provide background for the in situ localization of the satDNAs, we performed a detailed cytogenetic analysis of the karyotypes of all three species. This comparative analysis revealed differences in chromosome number, number and location of rDNA clusters, and molecular differentiation of sex chromosomes.
- Publication type
- Journal Article MeSH
RepeatExplorer2 is a novel version of a computational pipeline that uses graph-based clustering of next-generation sequencing reads for characterization of repetitive DNA in eukaryotes. The clustering algorithm facilitates repeat identification in any genome by using relatively small quantities of short sequence reads, and additional tools within the pipeline perform automatic annotation and quantification of the identified repeats. The pipeline is integrated into the Galaxy platform, which provides a user-friendly web interface for script execution and documentation of the results. Compared to the original version of the pipeline, RepeatExplorer2 provides automated annotation of transposable elements, identification of tandem repeats and enhanced visualization of analysis results. Here, we present an overview of the RepeatExplorer2 workflow and provide procedures for its application to (i) de novo repeat identification in a single species, (ii) comparative repeat analysis in a set of species, (iii) development of satellite DNA probes for cytogenetic experiments and (iv) identification of centromeric repeats based on ChIP-seq data. Each procedure takes approximately 2 d to complete. RepeatExplorer2 is available at https://repeatexplorer-elixir.cerit-sc.cz .
- MeSH
- DNA Probes chemistry genetics MeSH
- DNA chemistry genetics MeSH
- Genomics methods MeSH
- Humans MeSH
- Repetitive Sequences, Nucleic Acid MeSH
- Sequence Analysis, DNA methods MeSH
- Cluster Analysis MeSH
- Software MeSH
- DNA Transposable Elements MeSH
- High-Throughput Nucleotide Sequencing methods MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Cultivated grasses are an important source of food for domestic animals worldwide. Increased knowledge of their genomes can speed up the development of new cultivars with better quality and greater resistance to biotic and abiotic stresses. The most widely grown grasses are tetraploid ryegrass species (Lolium) and diploid and hexaploid fescue species (Festuca). In this work, we characterized repetitive DNA sequences and their contribution to genome size in five fescue and two ryegrass species as well as one fescue and two ryegrass cultivars. RESULTS: Partial genome sequences produced by Illumina sequencing technology were used for genome-wide comparative analyses with the RepeatExplorer pipeline. Retrotransposons were the most abundant repeat type in all seven grass species. The Athila element of the Ty3/gypsy family showed the most striking differences in copy number between fescues and ryegrasses. The sequence data enabled the assembly of the long terminal repeat (LTR) element Fesreba, which is highly enriched in centromeric and (peri)centromeric regions in all species. A combination of fluorescence in situ hybridization (FISH) with a probe specific to the Fesreba element and immunostaining with centromeric histone H3 (CENH3) antibody showed their co-localization and indicated a possible role of Fesreba in centromere function. CONCLUSIONS: Comparative repeatome analyses in a set of fescues and ryegrasses provided new insights into their genome organization and divergence, including the assembly of the LTR element Fesreba. A new LTR element Fesreba was identified and found in abundance in centromeric regions of the fescues and ryegrasses. It may play a role in the function of their centromeres.
BACKGROUND AND AIMS: Most crucifer species (Brassicaceae) have small nuclear genomes (mean 1C-value 617 Mb). The species with the largest genomes occur within the monophyletic Hesperis clade (Mandáková et al., Plant Physiology 174: 2062-2071; also known as Clade E or Lineage III). Whereas most chromosome numbers in the clade are 6 or 7, monoploid genome sizes vary 16-fold (256-4264 Mb). To get an insight into genome size evolution in the Hesperis clade (~350 species in ~48 genera), we aimed to identify, quantify and localize in situ the repeats from which these genomes are built. We analysed nuclear repeatomes in seven species, covering the phylogenetic and genome size breadth of the clade, by low-pass whole-genome sequencing. METHODS: Genome size was estimated by flow cytometry. Genomic DNA was sequenced on an Illumina sequencer and DNA repeats were identified and quantified using RepeatExplorer; the most abundant repeats were localized on chromosomes by fluorescence in situ hybridization. To evaluate the feasibility of bacterial artificial chromosome (BAC)-based comparative chromosome painting in Hesperis-clade species, BACs of arabidopsis were used as painting probes. KEY RESULTS: Most biennial and perennial species of the Hesperis clade possess unusually large nuclear genomes due to the proliferation of long terminal repeat retrotransposons. The prevalent genome expansion was rarely, but repeatedly, counteracted by purging of transposable elements in ephemeral and annual species. CONCLUSIONS: The most recent common ancestor of the Hesperis clade has experienced genome upsizing due to transposable element amplification. Further genome size increases, dominating diversification of all Hesperis-clade tribes, contrast with the overall stability of chromosome numbers. In some subclades and species genome downsizing occurred, presumably as an adaptive transition to an annual life cycle.
The amplification versus purging of transposable elements and tandem repeats impacted the chromosomal architecture of the Hesperis-clade species.
Satellite DNA (satDNA) is the most variable fraction of the eukaryotic genome. Related species share a common ancestral satDNA library, and a change in any library component in a particular lineage results in interspecific differences. Although the general developmental trend is clear, our knowledge of the origin and dynamics of satDNAs is still fragmentary. Here, we explore whole genome shotgun Illumina reads using the RepeatExplorer (RE) pipeline to infer satDNA family life stories in the genomes of Chenopodium species. The seven diploids studied represent separate lineages and provide an example of a species complex typical for angiosperms. Similarity searches within the RE pipeline allowed us to determine the satDNA family with a basic monomer of ~40 bp and to trace its transformation from the reconstructed ancestral sequence to the species-specific sequences. As a result, three types of satDNA family evolutionary development were distinguished: (i) concerted evolution with mutation and recombination events; (ii) concerted evolution with a trend toward increased complexity and length of the satellite monomer; and (iii) non-concerted evolution, with low levels of homogenization and multidirectional trends. The third type is an example of entire repeatome transformation, thus producing a novel set of satDNA families, and genomes showing non-concerted evolution are proposed as a significant source for genomic diversity.
- MeSH
- Chenopodium genetics MeSH
- Diploidy MeSH
- DNA, Plant genetics MeSH
- Species Specificity MeSH
- Phylogeny MeSH
- Genome, Plant MeSH
- Genome Components MeSH
- Evolution, Molecular MeSH
- DNA, Satellite genetics MeSH
- Sequence Analysis, DNA MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Publication type
- Journal Article MeSH
Allopolyploidy has played an important role in the evolution of the flowering plants. Genome mergers are often accompanied by significant and rapid alterations of genome size and structure via chromosomal rearrangements and altered dynamics of tandem and dispersed repetitive DNA families. Recent developments in sequencing technologies and bioinformatic methods allow for a comprehensive investigation of the repetitive component of plant genomes. Interpretation of evolutionary dynamics following allopolyploidization requires both the knowledge of parentage and the age of origin of an allopolyploid. Whereas parentage is typically inferred from cytogenetic and phylogenetic data, age inference is hampered by the reticulate nature of the phylogenetic relationships. Treating subgenomes of allopolyploids as if they belonged to different species (i.e., no recombination among subgenomes) and applying cross-bracing (i.e., putting a constraint on the age difference of nodes pertaining to the same event), we can infer the age of allopolyploids within the framework of the multispecies coalescent within BEAST2. Together with a comprehensive characterization of the repetitive DNA fraction using the RepeatExplorer pipeline, we apply the dating approach in a group of closely related allopolyploids and their progenitor species in the plant genus Melampodium (Asteraceae). We dated the origin of both the allotetraploid, Melampodium strigosum, and its two allohexaploid derivatives, Melampodium pringlei and Melampodium sericeum, which share both parentage and the direction of the cross, to the Pleistocene (<1.4 Ma). Thus, Pleistocene climatic fluctuations may have triggered formation of allopolyploids possibly in short intervals, contributing to difficulties in inferring the precise temporal order of allopolyploid species divergence of M. sericeum and M. pringlei.
The relatively recent origin of the allopolyploids likely played a role in the near-absence of major changes in the repetitive fraction of the polyploids' genomes. The repetitive elements most affected by the postpolyploidization changes were retrotransposons of the Ty1-copia lineage Maximus and, to a lesser extent, Athila elements of the Ty3-gypsy family.
The characterization of unusual telomere sequence sheds light on patterns of telomere evolution, maintenance and function. Plant species from the closely related genera Cestrum, Vestia and Sessea (family Solanaceae) lack known plant telomeric sequences. Here we characterize the telomere of Cestrum elegans, work that was a challenge because of its large genome size and few chromosomes (1C 9.76 pg; n = 8). We developed an approach that combines BAL31 digestion, which digests DNA from the ends and chromosome breaks, with next-generation sequencing (NGS), to generate data analysed in RepeatExplorer, designed for de novo repeat identification and quantification. We identify a unique repeat motif (TTTTTTAGGG)n in C. elegans, occurring in ca. 30 400 copies per haploid genome, averaging ca. 1900 copies per telomere. We demonstrate that the motif is synthesized by telomerase. The occurrence of an unusual eukaryote (TTTTTTAGGG)n telomeric motif in C. elegans represents a switch in motif from the 'typical' angiosperm telomere (TTTAGGG)n. That switch may have happened with the divergence of Cestrum, Sessea and Vestia. The shift in motif when it arose would have had profound effects on telomere activity. Thus our finding provides a unique handle to study how telomerase and telomeres responded to genetic change, studies that will shed more light on telomere function.
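The per-telomere figure in the abstract above follows from the reported copy number and chromosome count, assuming two telomeres per linear chromosome. The sketch below reproduces that arithmetic and shows one simple way to count back-to-back copies of the (TTTTTTAGGG)n motif in a sequence. The function names and the toy sequence are hypothetical, and the regular-expression scan is a stand-in for illustration, not the BAL31/RepeatExplorer procedure used in the study.

```python
import re

MOTIF = "TTTTTTAGGG"  # telomeric motif reported for Cestrum elegans


def longest_tandem_run(seq: str, motif: str = MOTIF) -> int:
    """Return the largest number of uninterrupted, back-to-back motif copies in seq."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def copies_per_telomere(total_copies: int, haploid_chromosomes: int) -> float:
    """Average copies per telomere, assuming two telomeres per chromosome."""
    return total_copies / (2 * haploid_chromosomes)


# Toy sequence (made up for this example): a run of 5 copies and a run of 2 copies.
toy = "GCGC" + MOTIF * 5 + "GATCGATC" + MOTIF * 2
run_length = longest_tandem_run(toy)

# Values from the abstract: ca. 30 400 copies per haploid genome, n = 8.
average = copies_per_telomere(30_400, 8)

print(run_length)  # 5
print(average)     # 1900.0
```

With n = 8 there are 16 telomeres per haploid chromosome set, so 30 400 copies average out to the ca. 1900 copies per telomere stated in the abstract.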
- MeSH
- Cestrum genetics MeSH
- Chromosomes, Plant genetics MeSH
- Telomere chemistry genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
MOTIVATION: Repetitive DNA makes up large portions of plant and animal nuclear genomes, yet it remains the least-characterized genome component in most species studied so far. Although the recent availability of high-throughput sequencing data provides necessary resources for in-depth investigation of genomic repeats, its utility is hampered by the lack of specialized bioinformatics tools and appropriate computational resources that would enable large-scale repeat analysis to be run by biologically oriented researchers. RESULTS: Here we present RepeatExplorer, a collection of software tools for characterization of repetitive elements, which is accessible via a web interface. A key component of the server is the computational pipeline using a graph-based sequence clustering algorithm to facilitate de novo repeat identification without the need for reference databases of known elements. Because the algorithm uses short sequences randomly sampled from the genome as input, it is ideal for analyzing next-generation sequence reads. Additional tools are provided to aid in classification of identified repeats, investigate phylogenetic relationships of retroelements and perform comparative analysis of repeat composition between multiple species. The server allows analysis of several million sequence reads, which typically results in identification of most high and medium copy repeats in higher plant genomes.
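The general idea of graph-based read clustering described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not RepeatExplorer's implementation: shared k-mers stand in for the read similarity search, connected components (via union-find) stand in for the pipeline's clustering step, and the function names, parameters (`k`, `min_shared`) and reads are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(read: str, k: int) -> set[str]:
    """All distinct substrings of length k in a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def cluster_reads(reads: list[str], k: int = 8, min_shared: int = 3) -> list[list[int]]:
    """Group read indices into connected components of a similarity graph.

    Two reads are joined by an edge when they share at least `min_shared`
    k-mers (an assumed, simplified similarity criterion).
    """
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Follow parent links to the component root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    kmer_sets = [kmers(read, k) for read in reads]
    for a, b in combinations(range(len(reads)), 2):
        if len(kmer_sets[a] & kmer_sets[b]) >= min_shared:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two components

    components: dict[int, list[int]] = defaultdict(list)
    for i in range(len(reads)):
        components[find(i)].append(i)
    return sorted(components.values(), key=len, reverse=True)


# Toy data: three reads sampled from a made-up tandem repeat, two unrelated reads.
repeat = "ACGTTGCATGCCGTAAGCTT" * 3
reads = [
    repeat[0:30],
    repeat[10:40],
    repeat[20:50],
    "GGGGGGGGCCCCCCCCAAAAAAAATTTTTT",
    "TCAGGATCCTAGCTAGGACTGATCCGATTA",
]
clusters = cluster_reads(reads)
largest_share = len(clusters[0]) / len(reads)

print(clusters)
print(f"Largest cluster holds {largest_share:.0%} of reads")
```

Because the input reads are a random sample of the genome, the share of reads falling into a cluster gives a rough proxy for how abundant the corresponding repeat is; the all-against-all comparison used here is quadratic and would not scale to the millions of reads the abstract mentions.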
- MeSH
- Algorithms MeSH
- DNA chemistry MeSH
- Eukaryota genetics MeSH
- Phylogeny MeSH
- Genome MeSH
- Internet MeSH
- Repetitive Sequences, Nucleic Acid * MeSH
- Sequence Analysis, DNA * MeSH
- Cluster Analysis MeSH
- Software * MeSH
- High-Throughput Nucleotide Sequencing * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH