Evolution of Tandem Repeats Is Mirroring Post-polyploid Cladogenesis in Heliophila (Brassicaceae)
Status PubMed-not-MEDLINE Language English Country Switzerland Medium electronic-ecollection
Document type journal articles
PubMed: 33510751
PubMed Central: PMC7835680
DOI: 10.3389/fpls.2020.607893
- Keywords
- Cape flora, Cruciferae, South Africa, plastome phylogeny, rDNA ITS, repeatome, repetitive DNA, whole-genome duplication (WGD)
- Publication type
- journal articles (MeSH)
The unigeneric tribe Heliophileae, encompassing more than 100 Heliophila species, is morphologically the most diverse Brassicaceae lineage. The tribe is endemic to southern Africa, confined chiefly to southwestern South Africa, home of two biodiversity hotspots (Cape Floristic Region and Succulent Karoo). The monospecific Chamira (C. circaeoides), the only crucifer species with persistent cotyledons, is traditionally retrieved as the closest relative of Heliophileae. Our transcriptome analysis revealed a whole-genome duplication (WGD) ∼26.15-29.20 million years ago, presumably preceding the Chamira/Heliophila split. The WGD was then followed by genome-wide diploidization, species radiations, and cladogenesis in Heliophila. The expanded phylogeny based on nuclear ribosomal DNA internal transcribed spacer (ITS) uncovered four major infrageneric clades (A-D) in Heliophila and corroborated the sister relationship between Chamira and Heliophila. Herein, we analyzed how the diploidization process impacted the evolution of repetitive sequences through low-coverage whole-genome sequencing of 15 Heliophila species, representing the four clades, and Chamira. Despite the firmly established infrageneric cladogenesis and different ecological life histories (four perennial vs. 11 annual species), repeatome analysis showed overall comparable evolution of genome sizes (288-484 Mb) and repeat content (25.04-38.90%) across Heliophila species and clades. Among Heliophila species, long terminal repeat (LTR) retrotransposons were the predominant components of the analyzed genomes (11.51-22.42%), whereas tandem repeats had lower abundances (1.03-12.10%). In Chamira, the tandem repeat content (17.92%, 16 diverse tandem repeats) is comparable to the abundance of LTR retrotransposons (16.69%). Among the 108 tandem repeats identified in Heliophila, only 16 repeats were found to be shared among two or more species; no tandem repeats were shared by Chamira and Heliophila genomes.
Six "relic" tandem repeats were shared between any two different Heliophila clades by common descent. Four and six clade-specific repeats shared among clade A and C species, respectively, support the monophyly of these two clades. Three repeats shared by all clade A species corroborate the recent diversification of this clade revealed by plastome-based molecular dating. Phylogenetic analysis based on repeat sequence similarities separated the Heliophila species into three clades [A, C, and (B+D)], mirroring the post-polyploid cladogenesis in Heliophila inferred from rDNA ITS and plastome sequences.
CEITEC Masaryk University Brno Czechia
Department of Biology Botany Osnabrück University Osnabrück Germany
Department of Experimental Biology Faculty of Science Masaryk University Brno Czechia
Department of Geography and Environmental Studies Stellenbosch University Stellenbosch South Africa
Harry Butler Institute Murdoch University Perth WA Australia
Institute of Botany Czech Academy of Sciences Průhonice Czechia
Missouri Botanical Garden St Louis MO United States
NCBR Faculty of Science Masaryk University Brno Czechia
South African National Biodiversity Institute Kirstenbosch Cape Town South Africa