Molecular identification of micro- and macroorganisms based on nuclear markers has revolutionized our understanding of their taxonomy, phylogeny and ecology. Today, research on the diversity of eukaryotes in global ecosystems heavily relies on nuclear ribosomal RNA (rRNA) markers. Here, we present the research community-curated reference database EUKARYOME for nuclear ribosomal 18S rRNA, internal transcribed spacer (ITS) and 28S rRNA markers for all eukaryotes, including metazoans (animals), protists, fungi and plants. It is particularly useful for the identification of arbuscular mycorrhizal fungi, as it bridges the four commonly used molecular markers: the ITS1, ITS2, 18S V4-V5 and 28S D1-D2 subregions. The key benefits of this database over other annotated reference sequence databases are that it is not restricted to certain taxonomic groups and that it includes all rRNA markers. EUKARYOME also offers a number of reference long-read sequences derived from (meta)genomic and (meta)barcoding data, a unique feature that can be used for taxonomic identification and chimera control of third-generation, long-read, high-throughput sequencing data. Taxonomic assignments of rRNA genes in the database are verified using phylogenetic approaches. The reference datasets are available in multiple formats from the project homepage, http://www.eukaryome.org.
- MeSH
- Databases, Genetic MeSH
- Databases, Nucleic Acid MeSH
- Eukaryota * genetics MeSH
- Phylogeny MeSH
- Genes, rRNA genetics MeSH
- RNA, Ribosomal, 18S genetics MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
The secondary structure of RNA molecules provides insights into the identity and function of RNAs. With RNAs readily sequenced, the question of their structural characterization is increasingly important. However, RNA structure is difficult to acquire. Its experimental identification is extremely technically demanding, while computational prediction is not accurate enough, especially for large structures of long sequences. We address this difficult situation with rPredictorDB, a predictive database of RNA secondary structures that aims to form a middle ground between experimentally identified structures in PDB and predicted consensus secondary structures in Rfam. The database contains individual secondary structures, predicted with a tool for template-based prediction of RNA secondary structure, for homologs in RNA families that have at least one member with an experimentally solved structure. Experimentally identified structures are used as the structural templates, and thus the prediction has higher reliability than de novo predictions in Rfam. The sequences are downloaded from public resources. So far, rPredictorDB covers 7365 RNAs with their secondary structures. Plots of the secondary structures use the Traveler package for readable display of RNAs with long sequences and complex structures, such as ribosomal RNAs. The RNAs in the output of rPredictorDB are extensively annotated and can be viewed, browsed, searched and downloaded according to taxonomic, sequence and structure data. Additionally, the structure of user-provided sequences can be predicted using the templates stored in rPredictorDB.
First results of multilocus sequence typing (MLST) of Haemophilus influenzae strains are presented. MLST of 28 H. influenzae strains isolated from patients with invasive diseases in the Czech Republic indicates clonal homogeneity of these strains: 22 of the 26 H. influenzae b strains tested were of the same sequence type, ST-6. Four strains were of two sequence types newly described in this study: ST-83 (3 strains) and ST-84 (1 strain). Two nontypeable H. influenzae strains were assigned to sequence types other than ST-6: ST-3 and ST-85, the latter newly described in this study. These first MLST results show ST-6 to be typical of H. influenzae b isolated from patients with invasive diseases in the Czech Republic. The sequence types newly described in this study, i.e. ST-83, ST-84 and ST-85, were submitted to the worldwide H. influenzae MLST database (http://haemophilus.mlst.net).
- MeSH
- Clone Cells MeSH
- Research Support as Topic MeSH
- Haemophilus influenzae genetics pathogenicity MeSH
- Humans MeSH
- Polymerase Chain Reaction MeSH
- Base Sequence MeSH
- Serotyping methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Database MeSH
- Review MeSH
- Comparative Study MeSH
The Plant rDNA database (www.plantrdnadatabase.com) is an open access online resource providing detailed information on numbers, structures and positions of 5S and 18S-5.8S-26S (35S) ribosomal DNA loci. The data have been obtained from >600 publications on plant molecular cytogenetics, mostly based on fluorescent in situ hybridization (FISH). This edition of the database contains information on 1609 species derived from 2839 records, which means an expansion of 55.76% and 94.45%, respectively. It holds the data for angiosperms, gymnosperms, bryophytes and pteridophytes available as of June 2013. Information from publications reporting data for a single rDNA (either 5S or 35S alone) and annotation regarding transcriptional activity of 35S loci now appears in the database. Preliminary analyses suggest greater variability in the number of rDNA loci in gymnosperms than in angiosperms. New applications provide ideograms of the species showing the positions of rDNA loci as well as a visual representation of their genome sizes. We have also introduced other features to boost the usability of the Web interface, such as an application for convenient data export and a new section with rDNA-FISH-related information (mostly detailing protocols and reagents). In addition, we upgraded and/or proofread tabs and links and modified the website for a more dynamic appearance. This manuscript provides a synopsis of these changes and developments. DATABASE URL: http://www.plantrdnadatabase.com.
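The expansion figures in the Plant rDNA database abstract can be turned back into approximate counts for the earlier edition. The sketch below assumes that "expansion" means relative growth over the previous release, which the abstract does not state explicitly; the function name `previous_count` is illustrative only.

```python
def previous_count(current: int, expansion_percent: float) -> float:
    """Back-calculate an earlier count from the current count and its growth.

    Assumption: expansion_percent is growth relative to the earlier count,
    i.e. current = previous * (1 + expansion_percent / 100).
    """
    return current / (1 + expansion_percent / 100)


# Figures taken from the abstract: 1609 species (+55.76%), 2839 records (+94.45%).
species_before = previous_count(1609, 55.76)
records_before = previous_count(2839, 94.45)

print(round(species_before), round(records_before))  # prints "1033 1460"
```

Under that reading, the earlier edition would have held roughly 1033 species and 1460 records. These are implied values, not figures reported in the abstract.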
IRESite is an exhaustive, manually annotated non-redundant relational database focused on the IRES elements (Internal Ribosome Entry Site) and containing information not available in the primary public databases. IRES elements were originally found in eukaryotic viruses hijacking initiation of translation of their host. Later on, they were also discovered in 5'-untranslated regions of some eukaryotic mRNA molecules. Currently, IRESite presents up to 92 biologically relevant aspects of every experiment, e.g. the nature of an IRES element, its functionality/defectivity, origin, size, sequence, structure, its relative position with respect to surrounding protein coding regions, positive/negative controls used in the experiment, the reporter genes used to monitor IRES activity, the measured reporter protein yields/activities, and references to original publications as well as cross-references to other databases, and also comments from submitters and our curators. Furthermore, the site presents the known similarities to rRNA sequences as well as RNA-protein interactions. Special care is given to the annotation of promoter-like regions. The annotated data in IRESite are bound to mostly complete, full-length mRNA, and whenever possible, accompanied by original plasmid vector sequences. New data can be submitted through the publicly available web-based interface at http://www.iresite.org and are curated by a team of lab-experienced biologists.
- MeSH
- Databases, Nucleic Acid MeSH
- Financing, Organized MeSH
- Peptide Chain Initiation, Translational MeSH
- Peptide Initiation Factors metabolism MeSH
- Internet MeSH
- RNA, Messenger chemistry MeSH
- Untranslated Regions chemistry MeSH
- Plasmids chemistry MeSH
- Promoter Regions, Genetic MeSH
- Regulatory Sequences, Ribonucleic Acid MeSH
- RNA, Viral chemistry MeSH
- User-Computer Interface MeSH
PART I -- Database Fundamentals -- Chapter 1 -- Database Overview -- Minicase: Teck Information Systems -- 1.1 Database Systems -- 1.2 The Database Approach: Shareability and Cooperation -- Approach -- Database Approach Summary -- 1.4 Roles -- Database Administrator -- Systems Database -- Design -- 5.1 Relational Database Design -- Mapping to the Logical Database -- Physical Design of Relational Database -- 5.2 Network Database Design -- Mapping to the Logical Database
1st ed. ix, 781 p.
- MeSH
- Databases, Nucleic Acid MeSH
- Databases, Protein MeSH
- Fungi genetics immunology MeSH
- Protein Kinase Inhibitors MeSH
- Cell Adhesion Molecules analysis genetics MeSH
- Protein Kinases analysis genetics MeSH
- Sequence Homology, Amino Acid MeSH
- Somatic Hypermutation, Immunoglobulin genetics immunology MeSH
Emerging infectious diseases (EIDs) caused by fungi are a severe problem in humans and plants across the world, threatening both food security and human health. Fungal infections are increasing worldwide, and current antimycotic drugs are losing effectiveness because of the emergence of resistant strains. There is therefore an urgent need to find new plant-origin antifungal peptides (PhytoAFPs). Many peptides that play a protective role against fungal infection have been extracted from different plant species, and hundreds of plant-origin peptides with antifungal activity have already been reported. A dedicated platform that systematically catalogs plant-origin peptides along with their antifungal properties is therefore needed. The PlantAFP database is a resource of experimentally verified plant-origin antifungal peptides collected from research articles, patents, and public databases. The current release contains 2585 peptide entries, of which 510 are unique peptides. Each entry provides comprehensive information on a peptide, including its sequence, name, class, length, molecular mass, antifungal activity, and origin. Besides this primary information, PlantAFP stores peptide sequences in SMILES format. To assist users, several tools have been integrated into the database, including BLAST search, peptide search, SMILES search, and peptide mapping. The PlantAFP database is accessible at http://bioinformatics.cimap.res.in/sharma/PlantAFP/.
Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated, mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain than common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
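The mapping of protein residue positions to genome coordinates mentioned in the prot2hg abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the coding sequence is given as a list of 1-based inclusive genomic segments, and the names `cds_offset_to_genomic` and `residue_to_genomic` as well as the toy coordinates are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: 1-based inclusive (start, end) of each coding segment.
Exon = Tuple[int, int]


def cds_offset_to_genomic(offset: int, exons: List[Exon], strand: str) -> int:
    """Map a 0-based offset within the spliced CDS to a 1-based genomic position."""
    # Walk the coding segments in transcription order (reversed on the minus strand).
    ordered = sorted(exons, reverse=(strand == "-"))
    for start, end in ordered:
        length = end - start + 1
        if offset < length:
            return start + offset if strand == "+" else end - offset
        offset -= length
    raise ValueError("offset lies beyond the end of the CDS")


def residue_to_genomic(residue: int, exons: List[Exon], strand: str = "+") -> List[int]:
    """Return the three genomic positions of the codon encoding a 1-based residue."""
    if residue < 1:
        raise ValueError("residue positions are 1-based")
    first = (residue - 1) * 3  # CDS offset of the first base of the codon
    return [cds_offset_to_genomic(first + i, exons, strand) for i in range(3)]


# Toy gene with two coding segments; the second codon spans the splice junction.
toy_exons = [(101, 104), (201, 210)]
print(residue_to_genomic(2, toy_exons, "+"))  # prints [104, 201, 202]
print(residue_to_genomic(4, toy_exons, "-"))  # prints [201, 104, 103]
```

A variant can then be tested against a domain by checking whether its genomic position falls within the positions returned for the domain's residues. A production resource would also need to handle transcript versions and incomplete coding sequences, which this sketch ignores.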
- MeSH
- Molecular Sequence Annotation methods MeSH
- Data Mining methods MeSH
- Databases, Genetic * MeSH
- Data Curation methods MeSH
- Genetic Variation * MeSH
- Genome, Human genetics MeSH
- Genomics methods MeSH
- Internet MeSH
- Humans MeSH
- Protein Domains genetics MeSH
- Proteins chemistry genetics metabolism MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The ability to decode antigen specificities encapsulated in the sequences of rearranged T-cell receptor (TCR) genes is critical for our understanding of the adaptive immune system and promises significant advances in the field of translational medicine. Recent developments in high-throughput sequencing methods (immune repertoire sequencing technology, or RepSeq) and single-cell RNA sequencing technology have allowed us to obtain huge numbers of TCR sequences from donor samples and link them to T-cell phenotypes. However, our ability to annotate these TCR sequences still lags behind, owing to the enormous diversity of the TCR repertoire and the scarcity of available data on T-cell specificities. In this paper, we present VDJdb, a database that stores and aggregates the results of published T-cell specificity assays and provides a universal platform that couples antigen specificities with TCR sequences. We demonstrate that VDJdb is a versatile instrument for the annotation of TCR repertoire data, enabling a concatenated view of antigen-specific TCR sequence motifs. VDJdb can be accessed at https://vdjdb.cdr3.net and https://github.com/antigenomics/vdjdb-db.
- MeSH
- Single-Cell Analysis MeSH
- Molecular Sequence Annotation * MeSH
- Antigens chemistry immunology metabolism MeSH
- Databases, Protein * MeSH
- Major Histocompatibility Complex genetics immunology MeSH
- Protein Interaction Domains and Motifs MeSH
- Internet MeSH
- Humans MeSH
- Macaca mulatta MeSH
- Models, Molecular MeSH
- Mice MeSH
- Receptors, Antigen, T-Cell chemistry immunology metabolism MeSH
- Protein Structure, Secondary MeSH
- Amino Acid Sequence MeSH
- Sequence Homology, Amino Acid MeSH
- Sequence Alignment MeSH
- Software * MeSH
- T-Lymphocytes cytology immunology MeSH
- Protein Binding MeSH
- Binding Sites MeSH
- High-Throughput Nucleotide Sequencing MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH