Recent technological advances have made next-generation sequencing (NGS) a popular and financially accessible technique allowing a broad range of analyses to be done simultaneously. The huge amount of newly generated NGS data, however, requires advanced software support to help both in analyzing the data and in biologically interpreting the results. In this article, we describe SATrans (Software for Annotation of Transcriptome), a software package providing fast and robust functional annotation of novel sequences obtained from transcriptome sequencing. Moreover, it performs advanced gene ontology analysis of differentially expressed genes, thereby helping to interpret, biologically and in a user-friendly form, the quantitative changes in gene expression. The software is freely available and makes it possible to work with thousands of sequences using a standard personal computer or notebook running the Linux operating system.
The tarnished plant bug (TPB), Lygus lineolaris (Palisot de Beauvois), is a polyphagous, phytophagous insect that has emerged as a major pest of cotton, alfalfa, fruits, and vegetable crops in the eastern United States and Canada. Using its piercing-sucking mouthparts, TPB employs a "lacerate and flush" feeding strategy in which saliva injected into plant tissue degrades cell wall components and lyses cells whose contents are subsequently imbibed by the TPB. A major component of TPB saliva is known to be the polygalacturonase enzymes that degrade the pectin in the cell walls. However, little is known about the other components of the saliva of this important pest. In this study, we explored the salivary gland transcriptome of TPB using Illumina sequencing. After in silico conversion of RNA sequences into corresponding polypeptides, 25,767 putative proteins were discovered. Of these, 19,540 (78.83%) showed significant similarity to known proteins in either the NCBI nr or UniProt databases. Gene ontology (GO) terms were assigned to 7,512 proteins, and 791 proteins in the sialotranscriptome of TPB were found to collectively map to 107 Kyoto Encyclopedia of Genes and Genomes (KEGG) database pathways. A total of 3,653 Pfam domains were identified in 10,421 sialotranscriptome predicted proteins, resulting in 12,814 Pfam annotations; some proteins had more than one Pfam domain. Functional annotation revealed a number of salivary gland proteins that potentially facilitate degradation of host plant tissues and mitigation of the host plant defense response. These transcripts/proteins and their potential roles in TPB establishment are described.
- MeSH
- Molecular Sequence Annotation MeSH
- Gene Ontology MeSH
- Heteroptera genetics growth & development metabolism MeSH
- Genes, Insect genetics MeSH
- Salivary Glands metabolism MeSH
- Gene Expression Profiling * MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Photosynthetic euglenids are major contributors to freshwater ecosystems. Euglena gracilis in particular has noted metabolic flexibility, reflected by an ability to thrive in a range of harsh environments. E. gracilis has been a popular model organism and of considerable biotechnological interest, but the absence of a gene catalogue has hampered both basic research and translational efforts. RESULTS: We report a detailed transcriptome and partial genome for E. gracilis Z1. The nuclear genome is estimated to be around 500 Mb in size, the transcriptome encodes over 36,000 proteins, and the genome possesses less than 1% coding sequence. Annotation of coding sequences indicates a highly sophisticated endomembrane system, RNA processing mechanisms and nuclear genome contributions from several photosynthetic lineages. Multiple gene families, including likely signal transduction components, have been massively expanded. Alterations in protein abundance are controlled post-transcriptionally between light and dark conditions, surprisingly similar to trypanosomatids. CONCLUSIONS: Our data provide evidence that a range of photosynthetic eukaryotes contributed to the Euglena nuclear genome, in support of the 'shopping bag' hypothesis for plastid acquisition. We also suggest that euglenids possess unique regulatory mechanisms for achieving extreme adaptability, through paralog expansion and gene acquisition.
The recent human monkeypox outbreak underlined the importance of studying the basic biology of orthopoxviruses. However, the transcriptome of its causative agent has not previously been investigated with either short- or long-read sequencing approaches. This Oxford Nanopore long-read RNA-sequencing dataset fills this gap. It will enable the in-depth characterization of the transcriptomic architecture of the monkeypox virus, and may even make it possible to annotate novel host transcripts. Moreover, our direct cDNA and native RNA sequencing reads will allow the estimation of gene expression changes of both the virus and the host cells during the infection. Overall, our study will lead to a deeper understanding of the alterations caused by the viral infection at the transcriptome level.
- MeSH
- DNA, Complementary MeSH
- Humans MeSH
- Nanopore Sequencing * MeSH
- Mpox, Monkeypox * MeSH
- Gene Expression Profiling MeSH
- Transcriptome MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Dataset MeSH
BACKGROUND: Ectoparasites from the family Diplozoidae (Platyhelminthes, Monogenea) are obligate haematophagous helminths of cyprinid fish. Current knowledge of these worms is for the most part limited to their morphological, phylogenetic, and population features. Information concerning the biochemical and molecular nature of physiological processes involved in host-parasite interaction, such as evasion of the immune system and its regulation, digestion of macromolecules, suppression of blood coagulation and inflammation, and effect on host tissue and physiology, is lacking. In this study, we report for the first time a comprehensive transcriptomic/secretome description of expressed genes and proteins secreted by the adult stage of Eudiplozoon nipponicum (Goto, 1891) Khotenovsky, 1985, an obligate sanguivorous monogenean which parasitises the gills of the common carp (Cyprinus carpio). RESULTS: RNA-seq raw reads (324,941 Roche 454 and 149,697,864 Illumina) were generated, de novo assembled, and filtered into 37,062 protein-coding transcripts. For 19,644 (53.0%) of them, we determined their sequence homologues. In silico functional analysis of E. nipponicum RNA-seq data revealed numerous transcripts, pathways, and GO terms responsible for immunomodulation (inhibitors of proteolytic enzymes, CD59-like proteins, fatty acid binding proteins), feeding (proteolytic enzymes cathepsins B, D, L1, and L3), and development (fructose 1,6-bisphosphatase, ferritin, and annexin). LC-MS/MS analysis identified 721 proteins secreted by E. nipponicum with predominantly immunomodulatory and anti-inflammatory functions (peptidyl-prolyl cis-trans isomerase, homolog to SmKK7, tetraspanin) and the ability to digest host macromolecules (cathepsins B, D, L1).
CONCLUSIONS: In this study, we integrated two high-throughput sequencing techniques, mass spectrometry analysis, and a comprehensive bioinformatics approach in order to arrive at the first comprehensive description of a monogenean transcriptome and secretome. Exploration of E. nipponicum transcriptome-related nucleotide sequences and translated and secreted proteins offers a better understanding of the molecular biology and biochemistry of these often neglected organisms. It enabled us to report the essential physiological pathways and protein molecules involved in their interactions with the fish hosts.
BACKGROUND: Prostate cancer is caused by genomic aberrations in normal epithelial cells; however, clinical translation of findings from analyses of cancer cells alone has been very limited. A deeper understanding of the tumour microenvironment is needed to identify the key drivers of disease progression and reveal novel therapeutic opportunities. RESULTS: In this study, the experimental enrichment of selected cell types, the development of a Bayesian inference model for continuous differential transcript abundance, and multiplex immunohistochemistry permitted us to define the transcriptional landscape of the prostate cancer microenvironment along the disease progression axis. An important role of monocytes and macrophages in prostate cancer progression and disease recurrence was uncovered, supported by both transcriptional landscape findings and differential tissue composition analyses. These findings were corroborated and validated by spatial analyses at the single-cell level using multiplex immunohistochemistry. CONCLUSIONS: This study advances our knowledge concerning the role of monocyte-derived recruitment in primary prostate cancer, and supports their key role in disease progression, patient survival and prostate microenvironment immune modulation.
- MeSH
- Molecular Sequence Annotation MeSH
- Immunophenotyping MeSH
- Immunohistochemistry MeSH
- Kaplan-Meier Estimate MeSH
- Humans MeSH
- Monocytes metabolism pathology MeSH
- Tumor Microenvironment genetics MeSH
- Prostatic Neoplasms diagnosis genetics metabolism mortality MeSH
- Prognosis MeSH
- Disease Progression MeSH
- Gene Expression Profiling * methods MeSH
- Transcriptome * MeSH
- Computational Biology methods MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Publication type
- Journal Article MeSH
Ectropis oblique Prout (Lepidoptera: Geometridae) is one of the main pests that damage the tea crop in Southeast Asia. To understand the molecular mechanisms of its feeding biology, transcriptomes of the alimentary tract (AT) and of the body minus the AT of E. oblique were sequenced and analyzed in this study. A total of 36,950 unigenes were assembled de novo. After analysis using six annotation databases (e.g., Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, and NCBI nr), a series of putative genes were found for this insect species that were related to digestion, detoxification, the immune system, and Bacillus thuringiensis (Bt) receptors. From this series of genes, 21 were randomly selected to verify the relative expression levels of transcripts using quantitative real-time polymerase chain reaction. These results provide a genomic resource for future studies on the molecular mechanisms of E. oblique, which will be useful in developing biological control strategies for this pest.
- MeSH
- Larva genetics growth & development MeSH
- Moths genetics growth & development MeSH
- Sequence Analysis, DNA MeSH
- Gene Expression Profiling MeSH
- Transcriptome * MeSH
- Digestive System MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: RNA-sequencing analysis is increasingly utilized to study gene expression in non-model organisms without sequenced genomes. Aethionema arabicum (Brassicaceae) exhibits seed dimorphism as a bet-hedging strategy, producing both a less dormant mucilaginous (M+) seed morph and a more dormant non-mucilaginous (NM) seed morph. Here, we compared de novo and reference-genome based transcriptome assemblies to investigate Ae. arabicum seed dimorphism and to evaluate the reference-free versus -dependent approach for identifying differentially expressed genes (DEGs). RESULTS: A de novo transcriptome assembly was generated using sequences from M+ and NM Ae. arabicum dry seed morphs. The transcripts of the de novo assembly contained 63.1% complete Benchmarking Universal Single-Copy Orthologs (BUSCO) compared to 90.9% for the transcripts of the reference genome. DEG detection used the strict consensus of three methods (DESeq2, edgeR and NOISeq). Only 37% of 1533 differentially expressed de novo assembled transcripts paired with 1876 genome-derived DEGs. Gene Ontology (GO) terms distinguished the seed morphs: the terms translation and nucleosome assembly were overrepresented in DEGs higher in abundance in M+ dry seeds, whereas terms related to mRNA processing and transcription were overrepresented in DEGs higher in abundance in NM dry seeds. DEGs amongst these GO terms included ribosomal proteins and histones (higher in M+), and RNA polymerase II subunits and related transcription and elongation factors (higher in NM). Expression of the inferred DEGs and other genes associated with seed maturation (e.g. those encoding late embryogenesis abundant proteins and transcription factors regulating seed development and maturation such as ABI3, FUS3, LEC1 and WRI1 homologs) was put in context with Arabidopsis thaliana seed maturation and indicated that M+ seeds may desiccate and mature faster than NM.
The 1901 transcriptomic DEG set GO-terms had almost 90% overlap with the 2191 genome-derived DEG GO-terms. CONCLUSIONS: Whilst there was only modest overlap of DEGs identified in reference-free versus -dependent approaches, the resulting GO analysis was concordant in both approaches. The identified differences in dry seed transcriptomes suggest mechanisms underpinning previously identified contrasts between morphology and germination behaviour of M+ and NM seeds.
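The "strict consensus of three methods" and the overlap figures reported in this abstract can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: the function names and the toy gene identifiers are hypothetical, and each method's DEG calls are assumed to be available as a collection of gene IDs.

```python
def strict_consensus(*call_sets):
    """Return the IDs that every method called differentially expressed."""
    sets = [set(s) for s in call_sets]
    return set.intersection(*sets) if sets else set()


def overlap_fraction(a, b):
    """Fraction of the items in `a` that also occur in `b`."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


# Hypothetical toy DEG calls standing in for DESeq2, edgeR and NOISeq output.
deseq2_calls = {"g1", "g2", "g3", "g5"}
edger_calls = {"g1", "g2", "g4", "g5"}
noiseq_calls = {"g1", "g2", "g5", "g6"}

consensus = strict_consensus(deseq2_calls, edger_calls, noiseq_calls)
print(sorted(consensus))  # ['g1', 'g2', 'g5']

# Hypothetical GO-term sets for two DEG lists; 2 of the 4 terms are shared.
go_de_novo = {"GO:0006412", "GO:0006334", "GO:0006397", "GO:0006351"}
go_genome = {"GO:0006412", "GO:0006334", "GO:0009737"}
print(overlap_fraction(go_de_novo, go_genome))  # 0.5
```

A strict consensus trades sensitivity for specificity: a gene is kept only if all three methods agree, so the consensus list can never be longer than the shortest individual list.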
- MeSH
- Molecular Sequence Annotation MeSH
- Brassicaceae genetics growth & development MeSH
- Genome, Plant MeSH
- Gene Ontology MeSH
- Germination MeSH
- Gene Expression Regulation, Plant * MeSH
- Plant Proteins genetics MeSH
- Seeds genetics growth & development MeSH
- Gene Expression Profiling MeSH
- Transcriptome * MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: The hop plant (Humulus lupulus L.) is a valuable source of several secondary metabolites, such as flavonoids, bitter acids, and essential oils. These compounds are widely used in the beer brewing industry and have potential biomedical applications. Several independent breeding programs around the world have been initiated to develop new cultivars with enriched lupulin and secondary metabolite contents, but they have met with limited success due to several constraints. In the present work, a pioneering attempt has been made to overexpress the master regulator binary transcription factor complex formed by HlWRKY1 and HlWDR1 using a plant expression vector to enhance the level of prenylflavonoid and bitter acid content in the hop. Subsequently, we performed transcriptional profiling using high-throughput RNA-Seq technology in leaves of the resultant transformants and wild-type hop to gain in-depth information about the genome-wide functional changes induced by HlWRKY1 and HlWDR1 overexpression. RESULTS: The transgenic WW-lines exhibited an elevated expression of structural and regulatory genes involved in prenylflavonoid and bitter acid biosynthesis pathways. In addition, the comparative transcriptome analysis revealed that a total of 522 transcripts involved in 30 pathways, including lipid and amino acid biosynthesis, primary carbon metabolism, phytohormone signaling and stress responses, were differentially expressed in WW-transformants. It was apparent from the whole transcriptome sequencing that modulation of primary carbon metabolism and other pathways by HlWRKY1 and HlWDR1 overexpression resulted in enhanced substrate flux towards the secondary metabolite pathways. The detailed analyses suggested that no pathways or genes with a detrimental effect on physiology, growth and development processes were induced on a genome-wide scale in WW-transgenic lines.
CONCLUSIONS: Taken together, our results suggest that simultaneous overexpression of HlWRKY1 and HlWDR1 positively regulates the prenylflavonoid and bitter acid biosynthesis pathways in the hop, and thus these transgenes are presented as prospective candidates for achieving enhanced secondary metabolite content in the hop.
Hazelnut (Corylus), which has high commercial and nutritional benefits, is an important tree for producing nuts and nut oil consumed as an ingredient, especially in chocolate. While Corylus avellana L. (European hazelnut, Betulaceae) and Corylus colurna L. (Turkish hazelnut, Betulaceae) are the two common hazelnut species in Europe, C. avellana L. (Tombul hazelnut) is grown as the most widespread hazelnut species in Turkey, and C. colurna L., which is the most important genetic resource for hazelnut breeding, exists naturally in Anatolia. We generated the transcriptome data of these two Corylus species and used these data for gene discovery and gene expression profiling. Total RNA from young leaves, flowers (male and female), buds, and husk shoots of C. avellana and C. colurna was used for two different libraries, which were sequenced using Illumina HiSeq4000 with 100 bp paired-end reads. The transcriptome data, 10.48 and 10.30 Gb for C. avellana and C. colurna, respectively, were assembled into 70,265 and 88,343 unigenes, respectively. These unigenes were functionally annotated using the TRAPID platform. We identified 25,312 and 27,051 simple sequence repeats (SSRs) for C. avellana and C. colurna, respectively. The TL1, GMPM1, N, 2MMP, At1g29670, and CHIB1 unigenes were selected for validation with qPCR. The first de novo transcriptome data of C. colurna were compared with data of the commercially important C. avellana. These data constitute a valuable extension of the publicly available transcriptomic resource aimed at breeding, medicinal, and industrial research studies.
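The abstract does not state which tool or thresholds were used to identify SSRs, so the following is only a hedged sketch of the general idea: scan each unigene for a short motif repeated in tandem. The function name `find_ssrs` and the minimum repeat counts (6 for di-, 5 for tri-, 4 for tetranucleotide motifs) are assumptions chosen for illustration, not the study's settings.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for tandem repeats in `seq`.

    `min_repeats` maps motif length to the minimum number of tandem copies;
    the defaults below are illustrative assumptions only.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each locus is reported once.
            if motif in (motif * 2)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


# Hypothetical toy sequence: (AT)7 and (AAG)5 separated by short flanks.
toy = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
print(find_ssrs(toy))  # [(3, 'AT', 7), (20, 'AAG', 5)]
```

Counting such hits over all unigenes of an assembly would give a per-species SSR total comparable in kind, though not necessarily in value, to the numbers reported above.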
- MeSH
- Corylus * genetics metabolism MeSH
- Nuts MeSH
- Gene Expression Profiling MeSH
- Publication type
- Journal Article MeSH
- Geographicals
- Turkey MeSH