Usability of reference-free transcriptome assemblies for detection of differential expression: a case study on Aethionema arabicum dimorphic seeds

2019 Jan 30; 20(1): 95. Epub 2019 Jan 30.

Language: English. Country: England, United Kingdom. Medium: electronic

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid30700268

Grant support
RE 1697/8-1 Deutsche Forschungsgemeinschaft
849.13.004 Netherlands Organization for International Cooperation in Higher Education
BB/M00192X/1 BB/M000583/1 Biotechnology and Biological Sciences Research Council - United Kingdom
NE/L002485/1 Natural Environment Research Council

Links

PubMed 30700268
PubMed Central PMC6354389
DOI 10.1186/s12864-019-5452-4
PII: 10.1186/s12864-019-5452-4

BACKGROUND: RNA-sequencing analysis is increasingly used to study gene expression in non-model organisms without sequenced genomes. Aethionema arabicum (Brassicaceae) exhibits seed dimorphism as a bet-hedging strategy, producing both a less dormant mucilaginous (M+) seed morph and a more dormant non-mucilaginous (NM) seed morph. Here, we compared de novo and reference-genome-based transcriptome assemblies to investigate Ae. arabicum seed dimorphism and to evaluate the reference-free versus reference-dependent approach for identifying differentially expressed genes (DEGs).

RESULTS: A de novo transcriptome assembly was generated using sequences from M+ and NM Ae. arabicum dry seed morphs. The transcripts of the de novo assembly contained 63.1% complete Benchmarking Universal Single-Copy Orthologs (BUSCO), compared to 90.9% for the transcripts of the reference genome. DEG detection used the strict consensus of three methods (DESeq2, edgeR and NOISeq). Only 37% of the 1533 differentially expressed de novo assembled transcripts paired with the 1876 genome-derived DEGs. Gene Ontology (GO) terms distinguished the seed morphs: the terms translation and nucleosome assembly were overrepresented in DEGs higher in abundance in M+ dry seeds, whereas terms related to mRNA processing and transcription were overrepresented in DEGs higher in abundance in NM dry seeds. DEGs amongst these GO terms included ribosomal proteins and histones (higher in M+) and RNA polymerase II subunits and related transcription and elongation factors (higher in NM). Expression of the inferred DEGs and of other genes associated with seed maturation (e.g. those encoding late embryogenesis abundant proteins and transcription factors regulating seed development and maturation, such as ABI3, FUS3, LEC1 and WRI1 homologs) was put in context with Arabidopsis thaliana seed maturation and indicated that M+ seeds may desiccate and mature faster than NM seeds. The 1901 transcriptomic DEG set GO terms had almost 90% overlap with the 2191 genome-derived DEG GO terms.

CONCLUSIONS: Whilst there was only modest overlap of DEGs identified in the reference-free versus reference-dependent approaches, the resulting GO analysis was concordant in both approaches. The identified differences in dry seed transcriptomes suggest mechanisms underpinning previously identified contrasts between morphology and germination behaviour of M+ and NM seeds.
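
The abstract states that DEGs were called by the strict consensus of three methods and that DEG sets and GO-term sets were then compared for overlap. The record does not give the implementation, so the following is only a minimal sketch of how such a consensus and overlap could be computed with plain set operations. The function names (`strict_consensus`, `overlap_fraction`) and all gene identifiers are hypothetical and are not taken from the paper.

```python
def strict_consensus(*deg_sets):
    """Return the identifiers that every method calls differentially expressed.

    Assumption: "strict consensus" is read here as the plain intersection
    of the per-method DEG lists.
    """
    if not deg_sets:
        return set()
    return set.intersection(*(set(s) for s in deg_sets))


def overlap_fraction(query, reference):
    """Fraction of items in `query` that are also present in `reference`."""
    query, reference = set(query), set(reference)
    return len(query & reference) / len(query) if query else 0.0


# Hypothetical toy calls from three methods (not real data).
deseq2_calls = {"g1", "g2", "g3", "g4"}
edger_calls = {"g2", "g3", "g4", "g5"}
noiseq_calls = {"g3", "g4", "g6"}

consensus = strict_consensus(deseq2_calls, edger_calls, noiseq_calls)
print(sorted(consensus))  # ['g3', 'g4']

# Hypothetical second DEG set, e.g. from the other assembly approach.
other_degs = {"g4", "g7"}
print(overlap_fraction(consensus, other_degs))  # 0.5
```

The same `overlap_fraction` helper could be applied to sets of GO terms instead of gene identifiers. Note that the result depends on which set is used as the denominator, so the direction of the comparison should be stated alongside any reported percentage.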

