Relative performance of customized and universal probe sets in target enrichment: A case study in subtribe Malinae

2021 Jul; 9(7): e11442. [Epub 2021 Jul 23]

Status: PubMed-not-MEDLINE. Language: English. Country: United States. Medium: electronic-eCollection

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid34336405

PREMISE: Custom probe design for target enrichment in phylogenetics is tedious and often hinders broader phylogenetic synthesis. The universal angiosperm probe set Angiosperms353 may be a solution. Here, we test the performance of Angiosperms353 on the Rosaceae subtribe Malinae relative to custom probes that we designed specifically for this clade. We then assess the effect of bioinformatically modifying Angiosperms353 by replacing the original probe sequences with orthologs extracted from the Malus domestica genome.

METHODS: To evaluate the relative performance of these probe sets, we compared enrichment efficiency, locus recovery, alignment length, the proportion of parsimony-informative sites, the proportion of potential paralogs, the topology and support of the resulting species trees, and gene tree discordance.

RESULTS: Locus recovery was highest for our custom Malinae probe set, and replacing the original Angiosperms353 sequences with a Malus representative improved locus recovery relative to the unmodified Angiosperms353. The proportion of parsimony-informative sites was similar among all probe sets, whereas gene tree discordance was lower for the custom probes.

DISCUSSION: A custom probe set benefits from data completeness and can be tailored to the specifics of a given project; however, Angiosperms353 was as phylogenetically informative as the custom probes. We therefore recommend using both a custom probe set and Angiosperms353 to facilitate large-scale systematic studies, where financially possible.
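One of the metrics named in the abstract, the proportion of parsimony-informative sites, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `parsimony_informative_proportion` and the toy alignment are hypothetical, and the sketch assumes the common definition that a column is informative when at least two different nucleotides each occur in at least two sequences, with gaps and ambiguity codes ignored.

```python
from collections import Counter


def parsimony_informative_proportion(alignment):
    """Return the fraction of alignment columns that are parsimony-informative.

    Assumption: a column is informative when at least two distinct
    nucleotides (A, C, G, T) each occur in at least two sequences.
    Gaps and ambiguity codes are ignored when counting.
    """
    if not alignment:
        return 0.0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    if length == 0:
        return 0.0

    informative = 0
    # zip(*alignment) iterates over the alignment column by column.
    for column in zip(*alignment):
        counts = Counter(base for base in (c.upper() for c in column)
                         if base in "ACGT")
        # Count the nucleotides that are shared by two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative / length


# Hypothetical toy alignment of four sequences, five columns long.
toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATG-A"]
print(parsimony_informative_proportion(toy_alignment))
```

In the toy alignment only the second and fifth columns meet the criterion, so two of five columns are informative. A per-locus value like this would then be compared across probe sets.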

Dryad: 10.5061/dryad.j3tx95xc0
