Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics
Status: PubMed-not-MEDLINE; Language: English; Country: United States; Medium: electronic-ecollection
Document type: journal articles
PubMed: 25225629
PubMed Central: PMC4162667
DOI: 10.3732/apps.1400042
PII: apps1400042
- Keywords
- Hyb-Seq, genome skimming, nuclear loci, phylogenomics, species tree, target enrichment
- Publication type
- journal articles
PREMISE OF THE STUDY: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies.
METHODS AND RESULTS: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes.
CONCLUSIONS: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics.
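The abstract reports recovery as the share of targeted exons and genes with at least partial assembly. As an illustration only, the following Python sketch shows one way such percentages could be tallied from a per-exon table. The function name `recovery_summary`, the record layout `(exon_id, gene_id, assembled_bp)`, and the toy values are assumptions made for this example; they are not taken from the study or its pipeline.

```python
from collections import defaultdict


def recovery_summary(exon_records):
    """Tally exon- and gene-level recovery.

    exon_records: iterable of (exon_id, gene_id, assembled_bp) tuples.
    This layout is hypothetical; an exon counts as recovered when any
    bases were assembled (assembled_bp > 0), i.e. "at least partial".
    """
    exons_total = 0
    exons_recovered = 0
    gene_has_hit = defaultdict(bool)  # gene_id -> True if any exon recovered

    for _exon_id, gene_id, assembled_bp in exon_records:
        exons_total += 1
        recovered = assembled_bp > 0
        exons_recovered += int(recovered)
        gene_has_hit[gene_id] = gene_has_hit[gene_id] or recovered

    if exons_total == 0:
        raise ValueError("no exon records supplied")

    return {
        "exon_fraction": exons_recovered / exons_total,
        "gene_fraction": sum(gene_has_hit.values()) / len(gene_has_hit),
    }


# Toy input, invented for illustration (not data from the paper).
toy = [
    ("g1_e1", "g1", 120),
    ("g1_e2", "g1", 0),
    ("g2_e1", "g2", 85),
    ("g3_e1", "g3", 0),
]
summary = recovery_summary(toy)
print(summary)
```

A gene is counted as recovered as soon as one of its exons has assembled sequence, which is why gene-level recovery can exceed exon-level recovery, as in the figures the abstract reports.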
Department of Botany, Oklahoma State University, 301 Physical Sciences, Stillwater, Oklahoma 74078, USA
Institute of Botany, Academy of Sciences of the Czech Republic, CZ 25243 Průhonice, Czech Republic