Evolution of Tandem Repeats Is Mirroring Post-polyploid Cladogenesis in Heliophila (Brassicaceae)

2020; 11: 607893. [Epub 2021-01-12]

Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic-eCollection

Document type: journal articles

Persistent link: https://www.medvik.cz/link/pmid33510751

The unigeneric tribe Heliophileae, encompassing more than 100 Heliophila species, is morphologically the most diverse Brassicaceae lineage. The tribe is endemic to southern Africa and confined chiefly to southwestern South Africa, home of two biodiversity hotspots (Cape Floristic Region and Succulent Karoo). The monospecific Chamira (C. circaeoides), the only crucifer species with persistent cotyledons, is traditionally recovered as the closest relative of Heliophileae. Our transcriptome analysis revealed a whole-genome duplication (WGD) approximately 26.15-29.20 million years ago, presumably preceding the Chamira/Heliophila split. The WGD was followed by genome-wide diploidization, species radiations, and cladogenesis in Heliophila. The expanded phylogeny based on the nuclear ribosomal DNA internal transcribed spacer (ITS) uncovered four major infrageneric clades (A-D) in Heliophila and corroborated the sister relationship between Chamira and Heliophila. Herein, we analyzed how the diploidization process impacted the evolution of repetitive sequences through low-coverage whole-genome sequencing of Chamira and 15 Heliophila species representing the four clades. Despite the firmly established infrageneric cladogenesis and different ecological life histories (four perennial vs. 11 annual species), repeatome analysis showed broadly comparable genome sizes (288-484 Mb) and repeat content (25.04-38.90%) across Heliophila species and clades. Among Heliophila species, long terminal repeat (LTR) retrotransposons were the predominant components of the analyzed genomes (11.51-22.42%), whereas tandem repeats had lower abundances (1.03-12.10%). In Chamira, the tandem repeat content (17.92%, 16 diverse tandem repeats) is roughly equal to the abundance of LTR retrotransposons (16.69%). Among the 108 tandem repeats identified in Heliophila, only 16 were shared among two or more species; no tandem repeats were shared by the Chamira and Heliophila genomes. Six "relic" tandem repeats were shared between any two different Heliophila clades by common descent. Four and six clade-specific repeats shared among clade A and clade C species, respectively, support the monophyly of these two clades. Three repeats shared by all clade A species corroborate the recent diversification of this clade revealed by plastome-based molecular dating. Phylogenetic analysis based on repeat sequence similarities separated the Heliophila species into three clades [A, C, and (B+D)], mirroring the post-polyploid cladogenesis in Heliophila inferred from rDNA ITS and plastome sequences.
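The abstract's counts of shared versus species-specific tandem repeats can be illustrated with a small presence/absence tally. The following Python sketch is not the study's method: the study based its phylogenetic analysis on repeat sequence similarities, whereas this example swaps in a much simpler Jaccard distance on presence/absence sets. The species labels, repeat IDs, and function names are all hypothetical and invented for illustration only.

```python
from itertools import combinations

# Hypothetical presence/absence data: species label -> set of tandem-repeat IDs.
# These labels and IDs are invented for illustration; they are NOT the study's data.
repeats_by_species = {
    "sp_A1": {"TR1", "TR2", "TR3"},
    "sp_A2": {"TR1", "TR2", "TR4"},
    "sp_C1": {"TR5", "TR6"},
    "sp_C2": {"TR5", "TR7"},
}


def shared_repeats(data):
    """Return the repeat IDs that occur in two or more species."""
    counts = {}
    for repeats in data.values():
        for repeat_id in repeats:
            counts[repeat_id] = counts.get(repeat_id, 0) + 1
    return {repeat_id for repeat_id, n in counts.items() if n >= 2}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(data):
    """Map each sorted species pair to its Jaccard distance."""
    return {
        (s, t): jaccard_distance(data[s], data[t])
        for s, t in combinations(sorted(data), 2)
    }


if __name__ == "__main__":
    print(sorted(shared_repeats(repeats_by_species)))
    for pair, dist in pairwise_distances(repeats_by_species).items():
        print(pair, round(dist, 3))
```

In this toy input, species that share repeats sit closer together than species that share none, which is the intuition behind using clade-specific repeats as evidence of monophyly. A real analysis would work from sequence similarity and abundance rather than from bare presence/absence.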

Erratum in: PubMed
