Communities structure
The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.
- MeSH
- Databases, Protein * MeSH
- Protein Conformation MeSH
- Cluster Analysis MeSH
- Software * MeSH
- Data Accuracy MeSH
- User-Computer Interface MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Europe MeSH
Non-coding RNAs (ncRNA) are essential for all life, and their functions often depend on their secondary (2D) and tertiary structure. Despite the abundance of software for the visualisation of ncRNAs, few tools automatically generate consistent and recognisable 2D layouts, which makes it challenging for users to construct, compare and analyse structures. Here, we present R2DT, a method for predicting and visualising a wide range of RNA structures in standardised layouts. R2DT is based on a library of 3,647 templates representing the majority of known structured RNAs. R2DT has been applied to ncRNA sequences from the RNAcentral database and produced >13 million diagrams, creating the world's largest RNA 2D structure dataset. The software is amenable to community expansion and is freely available at https://github.com/rnacentral/R2DT; a web server is available at https://rnacentral.org/r2dt.
- MeSH
- Databases, Nucleic Acid MeSH
- Nucleic Acid Conformation MeSH
- RNA, Untranslated chemistry MeSH
- Reproducibility of Results MeSH
- RNA chemistry MeSH
- Sequence Analysis, RNA MeSH
- Software MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Intramural MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.
- MeSH
- Databases, Protein MeSH
- Protein Conformation MeSH
- Nucleic Acids * MeSH
- Proteins * chemistry MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Europe MeSH
The concept of operational taxonomic units (OTUs), which constructs "mathematically" defined taxa, is widely accepted and applied to describe bacterial communities using amplicon sequencing of the 16S rRNA gene. OTUs are often used to infer functional traits since they are considered to fairly represent community members. However, the link between molecular taxa, real taxa, and OTUs seems to be much more complicated. Strains of the same bacterial species (ideally belonging to the same OTU) typically share only some genes (the core genome), while other genes are strain-specific and unique. It is thus unclear to what extent important functional traits are homogeneous within an OTU and how correctly functional traits can be inferred for individual OTU members. Here, we have tested in silico the similarity of all genes and, more specifically, the set of genes encoding glycoside hydrolases (GH) in bacterial genomes that belong to the same OTU. Genome similarity varied among OTUs, but as many as 5-78% of genes were not shared between the two bacterial genomes in the pair. The complement of GH families (the presence of gene families and the number of genes per family) differed in 95% of OTUs. On average, 43% of GH families either differed in gene counts or were present in one genome and absent in the other. These results show a serious limitation of OTU-based approaches when used to infer the functional traits of bacterial communities and raise the question of how to link environmental sequencing data and microbial functions.
- MeSH
- Bacteria classification genetics MeSH
- Genes, Bacterial genetics MeSH
- Databases, Nucleic Acid MeSH
- DNA, Bacterial genetics MeSH
- Phylogeny MeSH
- Genetic Variation MeSH
- Genome, Bacterial genetics MeSH
- Glycoside Hydrolases genetics MeSH
- Metagenomics * MeSH
- Microbiota MeSH
- RNA, Ribosomal, 16S genetics MeSH
- Sequence Analysis, DNA MeSH
- Publication type
- Journal Article MeSH
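The glycoside hydrolase comparison described in the OTU abstract above (families that differ in gene count, or that are present in one genome and absent in the other) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `gh_family_differences`, the input format (one GH family label per gene) and the example genomes are all assumptions made here for illustration.

```python
from collections import Counter


def gh_family_differences(genome_a, genome_b):
    """Compare the GH family complements of two genomes.

    genome_a, genome_b: iterables of GH family labels, one label per gene
    (hypothetical input format, assumed for this sketch).
    Returns (number of families seen, number that differ, fraction that differ).
    """
    counts_a = Counter(genome_a)
    counts_b = Counter(genome_b)
    # Union of families found in either genome.
    families = set(counts_a) | set(counts_b)
    # A family "differs" if its gene count is not identical in both genomes;
    # a Counter returns 0 for missing keys, so presence/absence is covered too.
    differing = {f for f in families if counts_a[f] != counts_b[f]}
    fraction = len(differing) / len(families) if families else 0.0
    return len(families), len(differing), fraction


# Made-up example data for two genomes assigned to the same OTU.
genome_a = ["GH13", "GH13", "GH3", "GH5"]
genome_b = ["GH13", "GH13", "GH3", "GH3", "GH43"]
total, differing, fraction = gh_family_differences(genome_a, genome_b)
print(total, differing, fraction)
```

In this toy pair, GH13 has the same count in both genomes, while GH3 differs in count and GH5 and GH43 are each present in only one genome.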
SUMMARY: Structures in PDB tend to contain errors. This is a very serious issue for authors who rely on such potentially problematic data. The community of structural biologists develops validation methods as countermeasures, which are also included in the PDB deposition system. But how are these validation efforts influencing the structure quality of subsequently published data? Which quality aspects are improving, and which remain problematic? We developed ValTrendsDB, a database that provides the results of an extensive exploratory analysis of relationships between quality criteria, size and metadata of biomacromolecules. Key input data are sourced from PDB. The discovered trends are presented via precomputed information-rich plots. ValTrendsDB also supports the visualization of a set of user-defined structures on top of general quality trends. Therefore, ValTrendsDB enables users to see the quality of structures published by a selected author, laboratory or journal, discover quality outliers, etc. ValTrendsDB is updated weekly. AVAILABILITY AND IMPLEMENTATION: Freely accessible at http://ncbr.muni.cz/ValTrendsDB. The web interface was implemented in JavaScript. The database was implemented in C++. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
A detailed quantum chemical study of five peptides (WG, WGG, FGG, GGF and GFA) containing the residues phenylalanyl (F), glycyl (G), tryptophyl (W) and alanyl (A), where F and W are of aromatic character, is presented. When investigating isolated small peptides, the dispersion interaction is the dominant attractive force in the peptide backbone-aromatic side chain intramolecular interaction. Consequently, an accurate theoretical study of these systems requires a methodology that properly covers the London dispersion forces. For this reason we have assessed the performance of the MP2, SCS-MP2, MP3, TPSS-D, PBE-D, M06-2X, BH&H, TPSS, B3LYP and tight-binding DFT-D methods and the ff99 empirical force field against CCSD(T)/complete basis set (CBS) limit benchmark data. All the DFT techniques with a '-D' symbol have been augmented by empirical dispersion energy, while the M06-2X functional was parameterized to cover the London dispersion energy. For the systems studied here, we have concluded that the use of the ff99 force field is not recommended, mainly due to problems concerning the assignment of reliable atomic charges. Tight-binding DFT-D is efficient as a screening tool providing reliable geometries. Among the DFT functionals, M06-2X and TPSS-D show the best performance, which is explained by the fact that both procedures cover the dispersion energy. The B3LYP and TPSS functionals, which do not cover this energy, fail systematically. Both electronic energies and geometries obtained by means of the wave-function theory methods compare satisfactorily with the CCSD(T)/CBS benchmark data.
The p53 family of transcription factors plays key roles in development, genome stability, senescence and tumor development, and p53 is the most important tumor suppressor protein in humans. Although intensively investigated for many years, its initial evolutionary history is not yet fully elucidated. Using bioinformatic and structure prediction methods on current databases containing newly-sequenced genomes and transcriptomes, we present a detailed characterization of p53 family homologs in remote members of the Holozoa group, in the unicellular clades Filasterea, Ichthyosporea and Corallochytrea. Moreover, we show that these newly characterized homologous sequences contain domains that can form structures with high similarity to the human p53 family DNA-binding domain, and some also show similarities to the oligomerization and SAM domains. The presence of these remote homologs demonstrates an ancient origin of the p53 protein family.
- MeSH
- Databases, Genetic MeSH
- Eukaryota classification genetics MeSH
- Exons MeSH
- Phylogeny MeSH
- Protein Interaction Domains and Motifs MeSH
- Introns MeSH
- Protein Conformation MeSH
- Evolution, Molecular * MeSH
- Models, Molecular MeSH
- Multigene Family * MeSH
- Tumor Suppressor Protein p53 chemistry genetics metabolism MeSH
- Amino Acid Sequence MeSH
- Sequence Homology, Amino Acid * MeSH
- Publication type
- Journal Article MeSH
The online resource http://www.plantrdnadatabase.com/ stores information on the number, chromosomal locations and structure of the 5S and 18S-5.8S-26S (35S) ribosomal DNAs (rDNA) in plants. This resource was exploited to study relationships between rDNA locus number, distribution, the occurrence of linked (L-type) and separated (S-type) 5S and 35S rDNA units, chromosome number, genome size and ploidy level. The analyses presented summarise current knowledge on rDNA locus numbers and distribution in plants. We analysed 2949 karyotypes, from 1791 species and 86 plant families, and performed ancestral character state reconstructions. The ancestral karyotype (2n = 16) has two terminal 35S sites and two interstitial 5S sites, while the median (2n = 24) presents four terminal 35S sites and three interstitial 5S sites. Whilst 86.57% of karyotypes show S-type organisation (ancestral condition), the L-type arrangement has arisen independently several times during plant evolution. A non-terminal position of 35S rDNA was found in about 25% of single-locus karyotypes, suggesting that terminal locations are not essential for functionality and expression. Single-locus karyotypes are very common, even in polyploids. In this regard, polyploidy is followed by subsequent locus loss. This results in a decrease in locus number per monoploid genome, forming part of the diploidisation process returning polyploids to a diploid-like state over time.
- MeSH
- Chromosomes, Plant genetics MeSH
- Databases, Genetic MeSH
- DNA, Plant genetics MeSH
- Phylogeny MeSH
- Genes, rRNA genetics MeSH
- Karyotype MeSH
- DNA, Ribosomal genetics MeSH
- RNA, Ribosomal, 18S genetics MeSH
- RNA, Ribosomal, 5S genetics MeSH
- Plants genetics MeSH
- Embryophyta genetics MeSH
- Publication type
- Journal Article MeSH
IRESite is an exhaustive, manually annotated non-redundant relational database focused on the IRES elements (Internal Ribosome Entry Site) and containing information not available in the primary public databases. IRES elements were originally found in eukaryotic viruses hijacking initiation of translation of their host. Later on, they were also discovered in 5'-untranslated regions of some eukaryotic mRNA molecules. Currently, IRESite presents up to 92 biologically relevant aspects of every experiment, e.g. the nature of an IRES element, its functionality/defectivity, origin, size, sequence, structure, its relative position with respect to surrounding protein coding regions, positive/negative controls used in the experiment, the reporter genes used to monitor IRES activity, the measured reporter protein yields/activities, and references to original publications as well as cross-references to other databases, and also comments from submitters and our curators. Furthermore, the site presents the known similarities to rRNA sequences as well as RNA-protein interactions. Special care is given to the annotation of promoter-like regions. The annotated data in IRESite are bound to mostly complete, full-length mRNA, and whenever possible, accompanied by original plasmid vector sequences. New data can be submitted through the publicly available web-based interface at http://www.iresite.org and are curated by a team of lab-experienced biologists.
- MeSH
- Databases, Nucleic Acid MeSH
- Financing, Organized MeSH
- Peptide Chain Initiation, Translational MeSH
- Peptide Initiation Factors metabolism MeSH
- Internet MeSH
- RNA, Messenger chemistry MeSH
- Untranslated Regions chemistry MeSH
- Plasmids chemistry MeSH
- Promoter Regions, Genetic MeSH
- Regulatory Sequences, Ribonucleic Acid MeSH
- RNA, Viral chemistry MeSH
- User-Computer Interface MeSH
Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which showed significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
- MeSH
- Algorithms MeSH
- Databases, Protein MeSH
- Phylogeny MeSH
- Genetic Variation MeSH
- Genetic Diseases, Inborn genetics MeSH
- Genome, Human MeSH
- Internet MeSH
- Polymorphism, Single Nucleotide * MeSH
- Humans MeSH
- Mutation * MeSH
- Computer Simulation MeSH
- Software MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
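The idea of a consensus classifier in the PredictSNP abstract above, where several tools vote and a result is still returned when some tools give no prediction, can be illustrated with a simple sketch. The abstract does not say how the individual predictions are combined, so the unweighted majority vote, the tie-breaking rule, the function name `consensus_prediction` and the example predictions below are all assumptions made for illustration; only the tool names come from the abstract.

```python
def consensus_prediction(predictions):
    """Combine per-tool calls into one consensus call.

    predictions: dict mapping tool name -> "deleterious", "neutral",
    or None when the tool returned no result (hypothetical format).
    Returns "deleterious", "neutral", or None if no tool gave a result.
    """
    # Ignore tools that produced no prediction for this mutation.
    votes = [p for p in predictions.values() if p is not None]
    if not votes:
        return None
    deleterious = sum(1 for p in votes if p == "deleterious")
    neutral = len(votes) - deleterious
    # Assumption: ties are resolved conservatively as "deleterious".
    return "deleterious" if deleterious >= neutral else "neutral"


# Made-up predictions for a single mutation.
example = {
    "SIFT": "deleterious",
    "PolyPhen-2": "deleterious",
    "MAPP": None,  # tool returned no result
    "SNAP": "neutral",
}
result = consensus_prediction(example)
print(result)
```

Because missing predictions are skipped rather than treated as errors, the consensus can still return a call whenever at least one tool produces a result.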