Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2

2020 Nov; 15(11): 3745-3776. Epub 2020 Oct 23.

Language: English. Country: United Kingdom, England. Medium: print-electronic

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid33097925

Grant support
LM2015047 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports) - International
CZ.02.1.01/0.0/0.0/16_013/0001777 Ministerstvo Školství, Mládeže a Tělovýchovy (Ministry of Education, Youth and Sports) - International

Links

PubMed 33097925
DOI 10.1038/s41596-020-0400-y
PII: 10.1038/s41596-020-0400-y

RepeatExplorer2 is a novel version of a computational pipeline that uses graph-based clustering of next-generation sequencing reads for characterization of repetitive DNA in eukaryotes. The clustering algorithm facilitates repeat identification in any genome by using relatively small quantities of short sequence reads, and additional tools within the pipeline perform automatic annotation and quantification of the identified repeats. The pipeline is integrated into the Galaxy platform, which provides a user-friendly web interface for script execution and documentation of the results. Compared to the original version of the pipeline, RepeatExplorer2 provides automated annotation of transposable elements, identification of tandem repeats and enhanced visualization of analysis results. Here, we present an overview of the RepeatExplorer2 workflow and provide procedures for its application to (i) de novo repeat identification in a single species, (ii) comparative repeat analysis in a set of species, (iii) development of satellite DNA probes for cytogenetic experiments and (iv) identification of centromeric repeats based on ChIP-seq data. Each procedure takes approximately 2 d to complete. RepeatExplorer2 is available at https://repeatexplorer-elixir.cerit-sc.cz.
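The idea of graph-based read clustering mentioned in the abstract can be illustrated with a toy sketch. This is not the RepeatExplorer2 implementation: it assumes shared k-mer content (Jaccard similarity) as a stand-in for read-to-read sequence alignment, and it takes plain connected components as clusters instead of a community-detection step. The names `kmers`, `cluster_reads`, `genome_proportions` and the parameters `k` and `min_shared` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k=8):
    """Return the set of k-mers found in one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_reads(reads, k=8, min_shared=0.5):
    """Group reads into clusters of similar sequences.

    Assumption: two reads are linked when the Jaccard similarity of their
    k-mer sets is at least `min_shared`. Clusters are the connected
    components of the resulting graph, found with a union-find structure.
    """
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    profiles = [kmers(r, k) for r in reads]
    for a, b in combinations(range(len(reads)), 2):
        union = profiles[a] | profiles[b]
        if union and len(profiles[a] & profiles[b]) / len(union) >= min_shared:
            parent[find(a)] = find(b)  # merge the two components

    clusters = defaultdict(list)
    for i in range(len(reads)):
        clusters[find(i)].append(i)
    # Largest clusters first: these correspond to the most abundant repeats.
    return sorted(clusters.values(), key=len, reverse=True)


def genome_proportions(clusters, n_reads):
    """Estimate each cluster's share of the sample as reads_in_cluster / n_reads."""
    return [len(c) / n_reads for c in clusters]
```

In such a sketch, reads drawn from a tandemly repeated unit end up in one large cluster, while reads from single-copy sequence remain singletons; the fraction of reads in a cluster then serves as a rough estimate of that repeat's abundance, provided the reads are a random low-coverage sample of the genome.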
