DANTE and DANTE_LTR: lineage-centric annotation pipelines for long terminal repeat retrotransposons in plant genomes
Status: PubMed-not-MEDLINE; Language: English; Country: Great Britain, England; Medium: electronic-eCollection
Document type: journal article
PubMed: 39211332
PubMed Central: PMC11358816
DOI: 10.1093/nargab/lqae113
PII: lqae113
Long terminal repeat (LTR) retrotransposons constitute a predominant class of repetitive DNA elements in most plant genomes. With the increasing number of sequenced plant genomes, there is an ongoing demand for computational tools facilitating efficient annotation and classification of LTR retrotransposons in plant genome assemblies. Herein, we introduce DANTE, a computational pipeline for Domain-based ANnotation of Transposable Elements, designed for sensitive detection of these elements via their conserved protein domain sequences. The identified protein domains are subsequently inputted into the DANTE_LTR pipeline to annotate complete element sequences by detecting their structural features, such as LTRs, in adjacent genomic regions. Leveraging domain sequences allows for precise classification of elements into phylogenetic lineages, offering a more granular annotation compared with coarser conventional superfamily-based classification methods. The efficiency and accuracy of this approach were evidenced via annotation of LTR retrotransposons in 93 plant genomes. Results were benchmarked against several established pipelines, showing that DANTE_LTR is capable of identifying significantly more intact LTR retrotransposons. DANTE and DANTE_LTR are provided as user-friendly Galaxy tools accessible via a public server (https://repeatexplorer-elixir.cerit-sc.cz), installable on local Galaxy instances from the Galaxy tool shed or executable from the command line.
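The two-step idea in the abstract (locate a conserved protein domain first, then search the adjacent genomic regions for LTRs) can be illustrated with a toy sketch. This is not the DANTE_LTR implementation: the function name `find_flanking_repeat`, its parameters, and the exact-match strategy are all assumptions made for illustration. LTRs of real elements diverge after insertion, so a practical tool would need approximate matching rather than the exact direct repeat searched for here.

```python
def find_flanking_repeat(seq, dom_start, dom_end, flank=50, min_len=8):
    """Toy illustration (hypothetical helper, not part of DANTE_LTR).

    Given the coordinates of a detected domain region [dom_start, dom_end),
    look for the longest exact direct repeat shared by the upstream and
    downstream flanks. Returns (left_start, right_start, length) in
    0-based coordinates of `seq`, or None if no repeat reaches `min_len`.
    """
    up_lo = max(0, dom_start - flank)
    upstream = seq[up_lo:dom_start]
    downstream = seq[dom_end:min(len(seq), dom_end + flank)]

    # Longest common substring by dynamic programming, one row at a time.
    best_len, best_i, best_j = 0, 0, 0
    prev = [0] * (len(downstream) + 1)
    for i in range(1, len(upstream) + 1):
        cur = [0] * (len(downstream) + 1)
        for j in range(1, len(downstream) + 1):
            if upstream[i - 1] == downstream[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_i, best_j = cur[j], i, j
        prev = cur

    if best_len < min_len:
        return None
    left_start = up_lo + best_i - best_len
    right_start = dom_end + best_j - best_len
    return (left_start, right_start, best_len)


# Synthetic example: a made-up 12 bp "LTR" on both sides of a made-up domain region.
ltr = "TGTTAGGCATCA"
domain = "ATGATGATGATG"
seq = "AAAA" + ltr + "CCCC" + domain + "GGGG" + ltr + "TTTT"
hit = find_flanking_repeat(seq, dom_start=20, dom_end=32)
```

In this synthetic sequence the domain occupies positions 20 to 31, and the sketch reports the two copies of the repeat on either side of it. The point is only that the domain hit narrows the LTR search to a bounded window instead of requiring a genome-wide scan for repeat pairs.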