Query: sequence database
Molecular identification of micro- and macroorganisms based on nuclear markers has revolutionized our understanding of their taxonomy, phylogeny and ecology. Today, research on the diversity of eukaryotes in global ecosystems heavily relies on nuclear ribosomal RNA (rRNA) markers. Here, we present the research community-curated reference database EUKARYOME for nuclear ribosomal 18S rRNA, internal transcribed spacer (ITS) and 28S rRNA markers for all eukaryotes, including metazoans (animals), protists, fungi and plants. It is particularly useful for the identification of arbuscular mycorrhizal fungi as it bridges the four commonly used molecular markers: the ITS1, ITS2, 18S V4-V5 and 28S D1-D2 subregions. The key benefits of this database over other annotated reference sequence databases are that it is not restricted to certain taxonomic groups and it includes all rRNA markers. EUKARYOME also offers a number of reference long-read sequences that are derived from (meta)genomics and (meta)barcoding, a unique feature that can be used for taxonomic identification and chimera control of third-generation, long-read, high-throughput sequencing data. Taxonomic assignments of rRNA genes in the database are verified based on phylogenetic approaches. The reference datasets are available in multiple formats from the project homepage, http://www.eukaryome.org.
- MeSH
- Databases, Genetic MeSH
- Databases, Nucleic Acid MeSH
- Eukaryota * genetics MeSH
- Phylogeny MeSH
- Genes, rRNA genetics MeSH
- RNA, Ribosomal, 18S genetics MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- RNA, Ribosomal, 18S MeSH
Secondary data structure of RNA molecules provides insights into the identity and function of RNAs. With RNAs readily sequenced, the question of their structural characterization is increasingly important. However, RNA structure is difficult to acquire. Its experimental identification is extremely technically demanding, while computational prediction is not accurate enough, especially for large structures of long sequences. We address this difficult situation with rPredictorDB, a predictive database of RNA secondary structures that aims to form a middle ground between experimentally identified structures in PDB and predicted consensus secondary structures in Rfam. The database contains individual secondary structures predicted using a tool for template-based prediction of RNA secondary structure for the homologs of the RNA families with at least one homolog with experimentally solved structure. Experimentally identified structures are used as the structural templates and thus the prediction has higher reliability than de novo predictions in Rfam. The sequences are downloaded from public resources. So far rPredictorDB covers 7365 RNAs with their secondary structures. Plots of the secondary structures use the Traveler package for readable display of RNAs with long sequences and complex structures, such as ribosomal RNAs. The RNAs in the output of rPredictorDB are extensively annotated and can be viewed, browsed, searched and downloaded according to taxonomic, sequence and structure data. Additionally, structure of user-provided sequences can be predicted using the templates stored in rPredictorDB.
- MeSH
- Databases, Nucleic Acid * MeSH
- Nucleic Acid Conformation * MeSH
- RNA * chemistry genetics MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- RNA * MeSH
Searching for similar sequences in a database via BLAST or a similar tool is one of the most common bioinformatics tasks, both in general and for non-coding RNAs in particular. However, the results of the search might be difficult to interpret due to the presence of partial matches to the database subject sequences. Here, we present rboAnalyzer, a tool that helps with interpreting sequence search results by (1) extending partial matches into plausible full-length subject sequences, (2) predicting homology of RNAs represented by full-length subject sequences to the query RNA, (3) pooling information across homologous RNAs found in the search results and public databases such as Rfam to predict more reliable secondary structures for all matches, and (4) contextualizing the matches by providing the prediction results and other relevant information in a rich graphical output. Using predicted full-length matches improves secondary structure prediction and makes rboAnalyzer robust with regard to identification of homology. The output of the tool should help the user to reliably characterize non-coding RNAs in BLAST output. The usefulness of rboAnalyzer and its ability to correctly extend partial matches to full length are demonstrated on known homologous RNAs. To allow the user to use custom databases and search options, rboAnalyzer accepts any search results as a text file in the BLAST format. The main output is an interactive HTML page displaying the computed characteristics and other context of the matches. The output can also be exported in appropriate sequence and/or secondary structure formats.
- Keywords
- RNA, RNA homology, database, search, secondary structure, sequence
- Publication type
- Journal Article MeSH
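Step (1) of the rboAnalyzer abstract, extending a partial match into a plausible full-length subject sequence, can be illustrated with a purely coordinate-based sketch. This is not rboAnalyzer's actual algorithm, which the abstract does not describe; the `Hit` class, the `extend_hit` function and the example coordinates are hypothetical, and the sketch assumes a plus-strand hit with 1-based inclusive coordinates and no indels in the unaligned flanks.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST-style match (hypothetical container), 1-based inclusive, plus strand."""
    q_start: int  # first aligned position in the query
    q_end: int    # last aligned position in the query
    s_start: int  # first aligned position in the subject
    s_end: int    # last aligned position in the subject


def extend_hit(hit: Hit, query_len: int, subject_len: int) -> tuple[int, int]:
    """Pad the subject range by the unaligned query flanks, clipped to the subject."""
    missing_left = hit.q_start - 1         # query bases before the match
    missing_right = query_len - hit.q_end  # query bases after the match
    start = max(1, hit.s_start - missing_left)
    end = min(subject_len, hit.s_end + missing_right)
    return start, end


# A match covering query positions 21..80 of a 100-nt query.
hit = Hit(q_start=21, q_end=80, s_start=1000, s_end=1059)
print(extend_hit(hit, query_len=100, subject_len=5000))  # (980, 1079)
```

A naive extension like this ignores insertions and deletions in the flanks, which is one reason a dedicated tool would refine the candidate region further.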
The Plant rDNA database (www.plantrdnadatabase.com) is an open access online resource providing detailed information on numbers, structures and positions of 5S and 18S-5.8S-26S (35S) ribosomal DNA loci. The data have been obtained from >600 publications on plant molecular cytogenetics, mostly based on fluorescent in situ hybridization (FISH). This edition of the database contains information on 1609 species derived from 2839 records, which means an expansion of 55.76% and 94.45%, respectively. It holds the data for angiosperms, gymnosperms, bryophytes and pteridophytes available as of June 2013. Information from publications reporting data for a single rDNA (either 5S or 35S alone) and annotation regarding transcriptional activity of 35S loci now appears in the database. Preliminary analyses suggest greater variability in the number of rDNA loci in gymnosperms than in angiosperms. New applications provide ideograms of the species showing the positions of rDNA loci as well as a visual representation of their genome sizes. We have also introduced other features to boost the usability of the Web interface, such as an application for convenient data export and a new section with rDNA-FISH-related information (mostly detailing protocols and reagents). In addition, we upgraded and/or proofread tabs and links and modified the website for a more dynamic appearance. This manuscript provides a synopsis of these changes and developments. DATABASE URL: http://www.plantrdnadatabase.com.
- MeSH
- Databases, Genetic * MeSH
- DNA, Plant * MeSH
- Internet MeSH
- DNA, Ribosomal * MeSH
- Plants genetics MeSH
- Database Management Systems * MeSH
- Computational Biology MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- DNA, Plant * MeSH
- DNA, Ribosomal * MeSH
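The growth figures in the Plant rDNA database abstract can be sanity-checked with a short calculation. The sketch below assumes the quoted percentages are relative to the previous edition, which the abstract implies but does not state outright; `previous_count` is a hypothetical helper name.

```python
def previous_count(current: int, growth_percent: float) -> float:
    """Back-calculate the earlier count from the current count and its percent growth."""
    return current / (1 + growth_percent / 100)


print(round(previous_count(1609, 55.76)))  # species in the earlier edition: 1033
print(round(previous_count(2839, 94.45)))  # records in the earlier edition: 1460
```

Both results come out within rounding distance of whole numbers, which is consistent with the percentages having been computed from integer counts.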
MOTIVATION: Tandemly organized repetitive sequences (satellite DNA) are widespread in complex eukaryotic genomes. In plants, satellite repeats often represent a substantial part of nuclear DNA but only a little is known about the molecular mechanisms of their amplification and their possible role(s) in genome evolution and function. Unfortunately, addressing these questions via characterization of general sequence properties of known satellite repeats has been hindered by a difficulty in obtaining a complete and unbiased set of sequence data for this analysis. This is mainly due to the presence of multiple entries of homologous sequences and of single entries that contain more than one repeated unit (monomer) in the public databases. RESULTS: We have established a computer database specialized for plant satellite repeats (PlantSat) that integrates sequence data available from various resources with supplementary information including repeat consensus sequences, abundances, and chromosomal localizations. The sequences are stored as individual repeat monomers grouped into families, which simplifies their computer analysis and makes it more accurate. Using this feature, we have performed a basic sequence analysis of the whole set of plant satellite repeats with respect to their monomer length and nucleotide composition. The analysis revealed several preferred length ranges of the monomers (approximately 165 bp and its multiples) and an over-representation of the AA/TT dinucleotide in the repeats. We have also detected an enrichment of satellite DNA sequences for the motif CAAAA that is supposed to be involved in breakage-reunion of repeated sequences.
- MeSH
- Electronic Data Processing MeSH
- Databases, Nucleic Acid * MeSH
- DNA, Plant genetics MeSH
- Internet MeSH
- Plants genetics MeSH
- DNA, Satellite genetics MeSH
- Sequence Analysis, DNA statistics & numerical data MeSH
- Software MeSH
- User-Computer Interface MeSH
- Computational Biology MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- DNA, Plant MeSH
- DNA, Satellite MeSH
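The kind of per-monomer statistics mentioned in the PlantSat abstract (dinucleotide over-representation and occurrences of the CAAAA motif) can be sketched as follows. This is an illustrative calculation only, not the authors' pipeline: the function names and the toy sequence are hypothetical, and over-representation is measured here as a simple observed/expected ratio under independent base frequencies, which is an assumption.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dinucleotide_ratio(seq: str, dinuc: str) -> float:
    """Observed/expected frequency of a dinucleotide on the given strand.

    Expected frequency is the product of the two single-base frequencies.
    AA on one strand corresponds to TT on the other.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0
    base_counts = Counter(seq)
    observed = sum(seq[i:i + 2] == dinuc for i in range(n - 1)) / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    return observed / expected if expected else 0.0


def count_motif_both_strands(seq: str, motif: str) -> int:
    """Count (possibly overlapping) motif occurrences on both strands."""
    def count(s: str) -> int:
        k = len(motif)
        return sum(s[i:i + k] == motif for i in range(len(s) - k + 1))

    seq = seq.upper()
    return count(seq) + count(reverse_complement(seq))


monomer = "CAAAATTTTG"  # toy sequence, not a real PlantSat entry
print(dinucleotide_ratio(monomer, "AA"))
print(count_motif_both_strands(monomer, "CAAAA"))
```

Storing each family as individual monomers, as the abstract describes, is what makes such per-unit statistics straightforward: every sequence passed to these functions is exactly one repeat unit.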
IRESite is an exhaustive, manually annotated non-redundant relational database focused on the IRES elements (Internal Ribosome Entry Site) and containing information not available in the primary public databases. IRES elements were originally found in eukaryotic viruses hijacking initiation of translation of their host. Later on, they were also discovered in 5'-untranslated regions of some eukaryotic mRNA molecules. Currently, IRESite presents up to 92 biologically relevant aspects of every experiment, e.g. the nature of an IRES element, its functionality/defectivity, origin, size, sequence, structure, its relative position with respect to surrounding protein coding regions, positive/negative controls used in the experiment, the reporter genes used to monitor IRES activity, the measured reporter protein yields/activities, and references to original publications as well as cross-references to other databases, and also comments from submitters and our curators. Furthermore, the site presents the known similarities to rRNA sequences as well as RNA-protein interactions. Special care is given to the annotation of promoter-like regions. The annotated data in IRESite are bound to mostly complete, full-length mRNA, and whenever possible, accompanied by original plasmid vector sequences. New data can be submitted through the publicly available web-based interface at http://www.iresite.org and are curated by a team of lab-experienced biologists.
- MeSH
- 5' Untranslated Regions chemistry MeSH
- Databases, Nucleic Acid * MeSH
- Peptide Chain Initiation, Translational * MeSH
- Peptide Initiation Factors metabolism MeSH
- Internet MeSH
- RNA, Messenger chemistry MeSH
- Plasmids chemistry MeSH
- Promoter Regions, Genetic MeSH
- Regulatory Sequences, Ribonucleic Acid MeSH
- RNA, Viral chemistry MeSH
- User-Computer Interface MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- 5' Untranslated Regions MeSH
- Peptide Initiation Factors MeSH
- RNA, Messenger MeSH
- Regulatory Sequences, Ribonucleic Acid MeSH
- RNA, Viral MeSH
BACKGROUND: Sequence variability in the hepatitis C virus (HCV) genome has led to the development and classification of six genotypes and a number of subtypes. The HCV 5' untranslated region mainly comprises an internal ribosomal entry site (IRES) responsible for cap-independent synthesis of the viral polyprotein and is conserved among all HCV genotypes. DESCRIPTION: Considering the possible high impact of variations in HCV IRES on viral protein production and thus virus replication, we decided to collect the available data on known nucleotide variants in the HCV IRES and their impact on IRES function in translation initiation. The HCV IRES variation database (HCVIVdb) is a collection of naturally occurring and engineered mutation entries for the HCV IRES. Each entry contains contextual information such as the HCV genotypic background and links to the original publication. Where available, quantitative data on the IRES efficiency in translation have been collated along with details on the reporter system used to generate the data. Data are displayed in both tabular and graphical formats and allow direct comparison of results from different experiments. Together the data provide a central resource for researchers in the IRES and hepatitis C-oriented fields. CONCLUSION: The collation of over 1900 mutations enables systematic analysis of the HCV IRES. The database is mainly dedicated to detailed comparative and functional analysis of all the HCV IRES domains, which can further lead to the development of site-specific drug designs and provide a guide for future experiments. HCVIVdb is available at http://www.hcvivdb.org.
- Keywords
- Database, HCV, Hepatitis C, IRES, Internal ribosome entry site, Translation efficiency
- MeSH
- 5' Untranslated Regions MeSH
- Databases, Genetic * MeSH
- Genotype MeSH
- Hepacivirus genetics metabolism MeSH
- Hepatitis C virology MeSH
- Internal Ribosome Entry Sites genetics MeSH
- Humans MeSH
- Mutation MeSH
- Protein Biosynthesis MeSH
- RNA, Viral genetics MeSH
- Data Collection MeSH
- Base Sequence MeSH
- Viral Proteins biosynthesis genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- 5' Untranslated Regions MeSH
- Internal Ribosome Entry Sites MeSH
- RNA, Viral MeSH
- Viral Proteins MeSH
The human endogenous retroviruses database (HERVd) is maintained at the Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, and is accessible via the World Wide Web at http://herv.img.cas.cz. The HERVd provides complex information on and analysis of retroviral elements found in the human genome. It can be used for searches of individual HERV families, identification of HERV parts, graphical output of HERV structures, comparison of HERVs and identification of retrovirus integration sites.
- MeSH
- Databases, Nucleic Acid * MeSH
- Endogenous Retroviruses genetics MeSH
- Genome, Human MeSH
- Virus Integration MeSH
- Internet MeSH
- Humans MeSH
- Evolution, Molecular MeSH
- Mice MeSH
- Database Management Systems MeSH
- Information Storage and Retrieval MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Proteins are the most abundant component of the cell nucleus, where they perform a plethora of functions, including the assembly of long DNA molecules into condensed chromatin, DNA replication and repair, regulation of gene expression, synthesis of RNA molecules and their modification. Proteins are important components of nuclear bodies and are involved in the maintenance of the nuclear architecture, transport across the nuclear envelope and cell division. Given their importance, the current poor knowledge of plant nuclear proteins and their dynamics during the cell's life and division is striking. Several factors hamper the analysis of the plant nuclear proteome, but the most critical seems to be the contamination of nuclei by cytosolic material during their isolation. With the availability of an efficient protocol for the purification of plant nuclei, based on flow cytometric sorting, contamination by cytoplasmic remnants can be minimized. Moreover, flow cytometry allows the separation of nuclei in different stages of the cell cycle (G1, S, and G2). This strategy has led to the identification of a large number of nuclear proteins from barley (Hordeum vulgare), thus triggering the creation of a dedicated database called UNcleProt, http://barley.gambrinus.ueb.cas.cz/.
- Keywords
- barley, cell cycle, database, flow-cytometry, localization, mass spectrometry, nuclear proteome, nucleus
- MeSH
- Cell Cycle * MeSH
- Data Mining MeSH
- Databases, Protein * MeSH
- Nuclear Proteins classification metabolism MeSH
- Hordeum cytology MeSH
- Plant Proteins classification metabolism MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- Nuclear Proteins MeSH
- Plant Proteins MeSH
The accessibility of the partial genome sequence of Francisella tularensis strain Schu 4 was the starting point for a comprehensive proteome analysis of the intracellular pathogen F. tularensis. The main goal of this study is identification of protein candidates of value for the development of diagnostics, therapeutics and vaccines. In this review, the current status of 2-DE F. tularensis database building, approaches used for identification of biologically important subsets of F. tularensis proteins, and functional and topological assignments of identified proteins using various prediction programs and database homology searches are presented.
- MeSH
- Bacterial Proteins chemistry genetics MeSH
- Databases, Protein * MeSH
- Francisella tularensis chemistry MeSH
- Proteome * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- Bacterial Proteins MeSH
- Proteome * MeSH