Recent technological advances have made next-generation sequencing (NGS) a popular and financially accessible technique allowing a broad range of analyses to be done simultaneously. The huge amount of newly generated NGS data, however, requires advanced software support for both analyzing the data and biologically interpreting the results. In this article, we describe SATrans (Software for Annotation of Transcriptome), a software package providing fast and robust functional annotation of novel sequences obtained from transcriptome sequencing. Moreover, it performs advanced gene ontology analysis of differentially expressed genes, thereby helping to interpret the quantitative changes in gene expression biologically and in a user-friendly form. The software is freely available and makes it possible to work with thousands of sequences on a standard personal computer or notebook running the Linux operating system.
- Keywords
- differentially expressed genes, functional annotation, transcriptome,
- MeSH
- Molecular Sequence Annotation methods MeSH
- Humans MeSH
- Sequence Analysis, RNA methods MeSH
- Software * MeSH
- Gene Expression Profiling methods MeSH
- Transcriptome * MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
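Gene ontology analysis of differentially expressed genes is commonly framed as an over-representation test. The abstract above does not state which statistic SATrans uses, so the following is only a minimal sketch of one widely used option, the one-sided hypergeometric test. The function name, parameter names, and toy numbers are hypothetical and are not taken from the software.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background set
    annotated:  background genes carrying the GO term of interest
    sample:     number of differentially expressed genes (DEGs)
    hits:       DEGs carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 1000 background genes, 50 carry the term,
# 100 DEGs, 12 of which carry the term (about 5 would be expected by chance).
p = hypergeom_enrichment_p(population=1000, annotated=50, sample=100, hits=12)
print(f"one-sided enrichment p-value: {p:.3g}")
```

In practice one such test is run per GO term, so the resulting p-values would normally be corrected for multiple testing before interpretation.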
BACKGROUND: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
- Keywords
- De novo transcriptome assembly, Differential expression analysis, High-performance computing, Non-model organisms, RNA-seq, Reproducible software, Transcriptome annotation,
- MeSH
- Molecular Sequence Annotation MeSH
- Sequence Analysis, RNA methods MeSH
- RNA-Seq MeSH
- Software * MeSH
- Gene Expression Profiling MeSH
- Transcriptome * MeSH
- Publication type
- Journal Article MeSH
The early evolution of eukaryotes and their adaptations to low-oxygen environments are fascinating open questions in biology. Genome-scale data from novel eukaryotes, and particularly from free-living lineages, are the key to answering these questions. The Parabasalia are a major group of anaerobic eukaryotes that form the most speciose lineage of Metamonada. The most well-studied are parasitic parabasalids, including Trichomonas vaginalis and Tritrichomonas foetus, but very little genome-scale data are available for free-living members of the group. Here, we sequenced the transcriptome of Pseudotrichomonas keilini, a free-living parabasalian. Comparative genomic analysis indicated that P. keilini possesses a metabolism and gene complement that are in many respects similar to its parasitic relative T. vaginalis and that in the time since their most recent common ancestor, it is the T. vaginalis lineage that has experienced more genomic change, likely due to the transition to a parasitic lifestyle. Features shared between P. keilini and T. vaginalis include a hydrogenosome (anaerobic mitochondrial homolog) that we predict to function much as in T. vaginalis and a complete glycolytic pathway that is likely to represent one of the primary means by which P. keilini obtains ATP. Phylogenomic analysis indicates that P. keilini branches within a clade of endobiotic parabasalids, consistent with the hypothesis that different parabasalid lineages evolved toward parasitic or free-living lifestyles from an endobiotic, anaerobic, or microaerophilic common ancestor.
Hazelnut (Corylus), which has high commercial and nutritional value, is an important tree for producing nuts and nut oil, consumed as an ingredient especially in chocolate. While Corylus avellana L. (European hazelnut, Betulaceae) and Corylus colurna L. (Turkish hazelnut, Betulaceae) are the two common hazelnut species in Europe, C. avellana L. (Tombul hazelnut) is the most widely grown hazelnut species in Turkey, and C. colurna L., which is the most important genetic resource for hazelnut breeding, occurs naturally in Anatolia. We generated transcriptome data for these two Corylus species and used these data for gene discovery and gene expression profiling. Total RNA from young leaves, flowers (male and female), buds, and husk shoots of C. avellana and C. colurna was used to construct two different libraries, which were sequenced using Illumina HiSeq4000 with 100 bp paired-end reads. The transcriptome data of C. avellana and C. colurna, 10.48 and 10.30 Gb, respectively, were assembled into 70,265 and 88,343 unigenes. These unigenes were functionally annotated using the TRAPID platform. We identified 25,312 and 27,051 simple sequence repeats (SSRs) for C. avellana and C. colurna, respectively. The unigenes TL1, GMPM1, N, 2MMP, At1g29670, and CHIB1 were selected for validation with qPCR. The first de novo transcriptome data of C. colurna were compared with those of the commercially important C. avellana. These data constitute a valuable extension of the publicly available transcriptomic resources for breeding, medicinal, and industrial research.
- Keywords
- Corylus spp., RNA-seq, de novo, hazelnut, transcriptome,
- MeSH
- Corylus * genetics metabolism MeSH
- Nuts MeSH
- Gene Expression Profiling MeSH
- Publication type
- Journal Article MeSH
- Geographicals
- Turkey MeSH
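Simple sequence repeats such as those counted in the hazelnut unigenes are tandem repeats of a short motif. The abstract does not name the SSR detection tool or its thresholds, so the sketch below only illustrates the general idea with a regular expression. The function name and the minimum repeat counts (6 for di-, 5 for tri-, 4 for tetranucleotide motifs) are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) tuples for perfect SSRs in seq.

    min_repeats maps motif length to the minimum number of tandem copies;
    the defaults are hypothetical thresholds, not those of the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured motif followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are repeats of a shorter motif (AA, AGAG, ...),
            # so each SSR is reported once under its shortest motif.
            if (unit + unit).find(unit, 1) != unit_len:
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (hypothetical): an (AG)7 and an (ATC)5 repeat with short flanks.
example = "TTC" + "AG" * 7 + "CCT" + "ATC" * 5 + "GGA"
print(find_ssrs(example))
```

Start positions are 0-based. Compound and imperfect repeats are not handled; dedicated SSR tools cover those cases.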
Ectropis oblique Prout (Lepidoptera: Geometridae) is one of the main pests that damage the tea crop in Southeast Asia. To understand the molecular mechanisms of its feeding biology, transcriptomes of the alimentary tract (AT) and of the body minus the AT of E. oblique were successfully sequenced and analyzed in this study. A total of 36,950 unigenes from de novo sequences were assembled. After analysis using six annotation databases (e.g., Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, and NCBI nr), a series of putative genes were found for this insect species that were related to digestion, detoxification, the immune system, and Bacillus thuringiensis (Bt) receptors. From this series of genes, 21 were randomly selected to verify the relative expression levels of transcripts using quantitative real-time polymerase chain reaction. These results will provide an invaluable genomic resource for future studies on the molecular mechanisms of E. oblique, which will be useful in developing biological control strategies for this pest.
- MeSH
- Larva genetics growth & development MeSH
- Moths genetics growth & development MeSH
- Sequence Analysis, DNA MeSH
- Gene Expression Profiling MeSH
- Transcriptome * MeSH
- Digestive System MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Although our knowledge regarding oocyte quality and development has improved significantly, the molecular mechanisms that regulate and determine oocyte developmental competence are still unclear. Therefore, the objective of this study was to identify and analyze the transcriptome profiles of porcine oocytes derived from large or small follicles using high-throughput RNA sequencing technology. RNA libraries were constructed from oocytes of large (LO; 3-6 mm) or small (SO; 1.5-1.9 mm) ovarian follicles and then sequenced on an Illumina HiSeq4000. Transcriptome analysis showed that a total of 14,557 genes were detected in both oocyte groups. Genes related to the cell cycle, oocyte meiosis, and quality were among the most highly expressed genes in both groups. Differential expression analysis revealed 60 up- and 262 downregulated genes in the LO compared with the SO group. BRCA2, GPLD1, ZP3, ND3, and ND4L were among the highly abundant and highly significant differentially expressed genes (DEGs). The ontological classification of DEGs indicated that protein processing in endoplasmic reticulum was the top enriched pathway. In addition, biological processes related to cell growth and signaling, regulation of gene expression, cytoskeleton, and extracellular matrix organization were among the highly enriched processes. In conclusion, this study provides new insights into the global transcriptome changes and the abundance of specific transcripts in porcine oocytes in correlation with follicle size.
- Keywords
- RNAseq, follicular size, oocyte, porcine,
- MeSH
- Gene Regulatory Networks physiology MeSH
- Oocytes metabolism MeSH
- Oogenesis genetics MeSH
- Ovarian Follicle cytology MeSH
- Reverse Transcriptase Polymerase Chain Reaction MeSH
- Swine genetics growth & development MeSH
- Signal Transduction genetics MeSH
- Gene Expression Profiling MeSH
- Transcriptome * MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Gene Expression Regulation, Developmental physiology MeSH
- Animals MeSH
- Check Tag
- Female MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
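Counts of up- and downregulated genes, such as those reported for the LO versus SO comparison, typically come from labelling each gene by its fold change and adjusted p-value. The abstract does not give the cutoffs or the statistical package used, so the sketch below is only an illustration: the function name, the log2 fold-change cutoff of 1, the significance level of 0.05, the pseudocount, and the toy expression values are all assumptions.

```python
from math import log2


def classify_deg(mean_lo, mean_so, padj,
                 lfc_cutoff=1.0, alpha=0.05, pseudocount=1.0):
    """Label a gene 'up', 'down', or 'ns' in LO relative to SO.

    mean_lo, mean_so: normalized mean expression in each group
    padj:             multiple-testing-adjusted p-value from an upstream test
    The cutoffs are hypothetical defaults, not those of the study.
    """
    # The pseudocount avoids division by zero for unexpressed genes.
    lfc = log2((mean_lo + pseudocount) / (mean_so + pseudocount))
    if padj < alpha and lfc >= lfc_cutoff:
        return "up"
    if padj < alpha and lfc <= -lfc_cutoff:
        return "down"
    return "ns"


# Toy values (hypothetical gene names and numbers).
toy = {
    "geneA": (99.0, 24.0, 0.01),   # log2(100/25) = 2, significant
    "geneB": (24.0, 99.0, 0.01),   # log2(25/100) = -2, significant
    "geneC": (99.0, 24.0, 0.20),   # large change but not significant
}
labels = {gene: classify_deg(*values) for gene, values in toy.items()}
print(labels)
```

The adjusted p-values are taken as given here; in a real analysis they would come from a dedicated differential expression method applied to replicate count data.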
BACKGROUND: The haploid male gametophyte generation of flowering plants consists of two- or three-celled pollen grains. This functional specialization is thought to be a key factor in the evolutionary success of flowering plants. Moreover, pollen ontogeny is also an attractive model in which to dissect cellular networks that control cell growth, asymmetric cell division and cellular differentiation. Our objective, and an essential step towards the detailed understanding of these processes, was to comprehensively define the male haploid transcriptome throughout development. RESULTS: We have developed staged spore isolation procedures for Arabidopsis and used Affymetrix ATH1 genome arrays to identify a total of 13,977 male gametophyte-expressed mRNAs, 9.7% of which were male-gametophyte-specific. The transition from bicellular to tricellular pollen was accompanied by a decline in the number of diverse mRNA species and an increase in the proportion of male gametophyte-specific transcripts. Expression profiles of regulatory proteins and distinct clusters of coexpressed genes were identified that could correspond to components of gametophytic regulatory networks. Moreover, integration of transcriptome and experimental data revealed the early synthesis of translation factors and their requirement to support pollen tube growth. CONCLUSIONS: The progression from proliferating microspores to terminally differentiated pollen is characterized by large-scale repression of early program genes and the activation of a unique late gene-expression program in maturing pollen. These data provide a quantum increase in knowledge concerning gametophytic transcription and lay the foundations for new genomic-led studies of the regulatory networks and cellular functions that operate to specify male gametophyte development.
- MeSH
- Arabidopsis genetics MeSH
- Genes, cdc MeSH
- Transcription, Genetic genetics MeSH
- Genome, Plant MeSH
- Haploidy * MeSH
- Homeodomain Proteins genetics MeSH
- Arabidopsis Proteins genetics MeSH
- Pollen genetics MeSH
- Genes, Plant genetics MeSH
- Oligonucleotide Array Sequence Analysis methods MeSH
- Spores genetics MeSH
- Gene Expression Profiling methods MeSH
- Transcription Factors genetics MeSH
- Gene Expression Regulation, Developmental genetics MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Homeodomain Proteins MeSH
- KNAT5 protein, Arabidopsis MeSH Browser
- Arabidopsis Proteins MeSH
- Transcription Factors MeSH
Staphylococcus aureus is a common biofilm-forming pathogen. Low doses of disinfectants have previously been reported to promote biofilm formation and to increase virulence. The aim of this study was to use transcriptome sequencing (RNA-seq) analysis to investigate global transcriptional changes in S. aureus in response to sublethal concentrations of the commonly used food industry disinfectants ethanol (EtOH) and chloramine T (ChT) and their combination (EtOH_ChT) in order to better understand the effects of these agents on biofilm formation. Treatment with EtOH and EtOH_ChT resulted in more significantly altered expression profiles than treatment with ChT. Our results revealed that EtOH and EtOH_ChT treatments enhanced the expression of genes responsible for regulation of gene expression (sigB), cell surface factors (clfAB), adhesins (sdrDE), and capsular polysaccharides (cap8EFGL), resulting in more intact biofilm. In addition, in this study we were able to identify the pathways involved in the adaptation of S. aureus to the stress of ChT treatment. Further, EtOH suppressed the effect of ChT on gene expression when these agents were used together at sublethal concentrations. These data show that in the presence of sublethal concentrations of tested disinfectants, S. aureus cells trigger protective mechanisms and try to cope with them. IMPORTANCE: So far, the effect of disinfectants is not satisfactorily explained. The presented data will allow a better understanding of the mode of disinfectant action with regard to biofilm formation and the ability of bacteria to survive the treatment. Such an understanding could contribute to the effort to eliminate possible sources of bacteria, making disinfectant application as efficient as possible. Biofilm formation plays a significant role in the spread and pathogenesis of bacterial species.
- Keywords
- bacterial decontamination, biofilm formation, gene expression,
- MeSH
- Biofilms drug effects MeSH
- Chloramines pharmacology MeSH
- Disinfectants pharmacology MeSH
- Ethanol pharmacology MeSH
- Sequence Analysis, RNA MeSH
- Gene Expression Profiling MeSH
- Staphylococcus aureus drug effects genetics physiology MeSH
- Transcriptome * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- chloramine MeSH Browser
- Chloramines MeSH
- Disinfectants MeSH
- Ethanol MeSH
Differences in the number of genome copies affect transcription and metabolite production in plants. This may be due to differences in gene transcription and protein expression, but the reasons remain poorly understood. Here we measured flavonoid content in leaves of three haploid and diploid grafted plants of Ginkgo biloba, a model gymnosperm that is economically important for its flavonoid content. We report the first combined transcriptomic and proteomic analysis of the difference in flavonoid content in three haploid ginkgos to investigate the effect of haploidy. Haploids always had smaller leaves and lower flavonoid content than the diploids. The selected haploid also had generally lower gene dosage than the selected diploid, with 1149 up-regulated (46.8 %) and 1309 down-regulated (53.2 %) among 2452 differentially expressed genes (DEGs). Of 686 differentially expressed proteins (DEPs) detected, 289 proteins (42.1 %) were upregulated, and 397 proteins (57.9 %) were downregulated in haploids. The downregulation of the hub genes PAL, PAM, FLS, and OMT1, which are involved in the regulation of flavonoid biosynthesis, deserves particular attention. Our study confirms the tendency of haploids to have lower metabolite content and suggests that the lower flavonoid content in ginkgo monoploids could be due to reduced dosage of the corresponding regulatory genes and downregulation of genes involved in flavonoid synthesis.
- Keywords
- Flavonoid content, Gene dose, Haploid ginkgo, Hub genes, Proteome, Transcriptome,
- MeSH
- Flavonoids metabolism MeSH
- Ginkgo biloba * genetics MeSH
- Haploidy MeSH
- Proteome genetics MeSH
- Proteomics MeSH
- Gene Expression Regulation, Plant MeSH
- Transcriptome * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Flavonoids MeSH
- Proteome MeSH
With the rise of next-generation sequencing methods, it has become increasingly possible to obtain genomewide sequence data even for nonmodel species. Such data are often used for the development of single nucleotide polymorphism (SNP) markers, which can subsequently be screened in a larger population sample using a variety of genotyping techniques. Many of these techniques require appropriate locus-specific PCR and genotyping primers. Currently, there is no publicly available software for the automated design of suitable PCR and genotyping primers from next-generation sequence data. Here we present a pipeline called Scrimer that automates multiple steps, including adaptor removal, read mapping, selection of SNPs and multiple primer design from transcriptome data. The designed primers can be used in conjunction with several widely used genotyping methods such as SNaPshot or MALDI-TOF genotyping. Scrimer is composed of several reusable modules and an interactive bash workflow that connects these modules. Even the basic steps are presented, so the workflow can be executed in a step-by-step manner. The use of standard formats throughout the pipeline allows data from various sources to be plugged in, as well as easy inspection of intermediate results with visualization tools of the user's choice.
- Keywords
- SNP genotyping, SNaPshot, next-generation sequencing, primer design, transcriptome,
- MeSH
- DNA Primers genetics MeSH
- Genotyping Techniques methods MeSH
- Polymerase Chain Reaction methods MeSH
- Sequence Analysis, DNA methods MeSH
- Transcriptome * MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- DNA Primers MeSH