Secondary structure elements (SSEs) are inherent parts of protein structures, and their arrangement is characteristic of each protein family. Therefore, annotation of SSEs can facilitate orientation in the vast number of homologous structures that are now available for many protein families. It also provides a way to identify and annotate key regions, such as active sites and channels, and subsequently to answer key research questions, such as understanding molecular function and its variability. This chapter introduces the concept of SSE annotation and describes the workflow for obtaining SSE annotation for the members of a selected protein family using the program SecStrAnnotator.
MOLEonline is an interactive, web-based application for the detection and characterization of channels (pores and tunnels) within biomacromolecular structures. The updated version of MOLEonline overcomes limitations of the previous version by incorporating the recently developed LiteMol Viewer visualization engine and providing a simple, fully interactive user experience. The application enables two modes of calculation: one is dedicated to the analysis of channels, while the other was specifically designed for transmembrane pores. As the application can use both PDB and mmCIF formats, it can be leveraged to analyze a wide spectrum of biomacromolecular structures, e.g. stemming from NMR, X-ray and cryo-EM techniques. The tool is interconnected with other bioinformatics tools (e.g., PDBe, CSA, ChannelsDB, OPM, UniProt) to help with both the setup and the analysis of the acquired results. MOLEonline provides unprecedented analytics for the detection and structural characterization of channels, as well as information about their numerous physicochemical features. Here we present the application of MOLEonline for structural analyses of α-hemolysin and transient receptor potential mucolipin 1 (TRPML1) pores. The MOLEonline application is freely available via the Internet at https://mole.upol.cz.
Realising the importance of assessing the quality of the biomolecular structures deposited in the Protein Data Bank (PDB), the Worldwide Protein Data Bank (wwPDB) partners established Validation Task Forces to obtain advice on the methods and standards to be used to validate structures determined by X-ray crystallography, nuclear magnetic resonance spectroscopy and three-dimensional electron cryo-microscopy. The resulting wwPDB validation pipeline is an integral part of the wwPDB OneDep deposition, biocuration and validation system. The wwPDB Validation Service webserver (https://validate.wwpdb.org) can be used to perform checks prior to deposition. Here, it is shown how validation metrics can be combined to produce an overall score that allows the ranking of macromolecular structures and domains in search results. The ValTrendsDB database provides users with a convenient way to access and analyse validation information and other properties of X-ray crystal structures in the PDB, including investigating trends in and correlations between different structure properties and validation metrics.
- MeSH
- protein databases standards MeSH
- data curation MeSH
- cryoelectron microscopy MeSH
- internet * MeSH
- protein conformation * MeSH
- humans MeSH
- macromolecular substances chemistry MeSH
- molecular models MeSH
- nuclear magnetic resonance, biomolecular MeSH
- proteins analysis chemistry MeSH
- user-computer interface * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- research support (grant-funded work) MeSH
- validation study MeSH
Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) play a crucial role in structure-guided drug discovery and design, and also provide atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. The quality with which small-molecule ligands have been modelled in Protein Data Bank (PDB) entries has been, and continues to be, a matter of concern for many investigators. Correctly interpreting whether electron density found in a binding site is compatible with the soaked or co-crystallized ligand or represents water or buffer molecules is often far from trivial. The Worldwide PDB validation report (VR) provides a mechanism to highlight any major issues concerning the quality of the data and the model at the time of deposition and annotation, so the depositors can fix issues, resulting in improved data quality. The ligand-validation methods used in the generation of the current VRs are described in detail, including an examination of the metrics to assess both geometry and electron-density fit. It is found that the LLDF score currently used to identify ligand electron-density fit outliers can give misleading results and that better ligand-validation metrics are required.
- MeSH
- protein databases * MeSH
- protein conformation * MeSH
- crystallography, X-ray MeSH
- humans MeSH
- ligands MeSH
- macromolecular substances chemistry MeSH
- molecular models MeSH
- molecular structure MeSH
- proteins analysis chemistry MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- research support (grant-funded work) MeSH
- validation study MeSH
ChannelsDB (http://ncbr.muni.cz/ChannelsDB) is a database providing information about the positions, geometry and physicochemical properties of channels (pores and tunnels) found within biomacromolecular structures deposited in the Protein Data Bank. Channels were deposited from two sources: manually from the literature, and from a software tool that automatically detects tunnels leading to enzymatic active sites and selected cofactors, as well as transmembrane pores. The database stores information about geometrical features (e.g. length and radius profile along a channel) and physicochemical properties involving polarity, hydrophobicity, hydropathy, charge and mutability. The stored data are interlinked with available UniProt annotation data mapping known mutation effects to channel-lining residues. All structures with channels are displayed in a clear interactive manner, further facilitating data manipulation and interpretation. As such, ChannelsDB provides an invaluable resource for research related to deciphering the biological function of biomacromolecular channels.
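To make the geometrical features mentioned above more concrete, the following minimal sketch shows how a channel's length and narrowest point (bottleneck) could be derived from a radius profile. It is an illustration only: the `ProfilePoint` type, the `summarize_profile` function and the example values are hypothetical and do not reflect the actual ChannelsDB data format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilePoint:
    """One sample of a channel profile (hypothetical layout, units in angstroms)."""
    distance: float  # position along the channel centerline
    radius: float    # radius of the largest sphere that fits at this position


def summarize_profile(profile):
    """Return the length and bottleneck of a channel from its radius profile."""
    if not profile:
        raise ValueError("profile must contain at least one point")
    # Sort by position so the first and last points bound the channel.
    ordered = sorted(profile, key=lambda p: p.distance)
    # The bottleneck is the sample with the smallest radius.
    bottleneck = min(ordered, key=lambda p: p.radius)
    return {
        "length": ordered[-1].distance - ordered[0].distance,
        "bottleneck_radius": bottleneck.radius,
        "bottleneck_distance": bottleneck.distance,
    }


# Invented example values, not taken from any real structure.
example_profile = [
    ProfilePoint(0.0, 2.5),
    ProfilePoint(5.0, 1.8),
    ProfilePoint(10.0, 1.2),
    ProfilePoint(15.0, 2.0),
]
summary = summarize_profile(example_profile)
print(summary)
```

The bottleneck is reported together with its position because the narrowest point, rather than the average radius, typically determines which molecules can pass through a channel.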
- MeSH
- amino acids chemistry metabolism MeSH
- cytochrome P-450 CYP2D6 chemistry genetics metabolism MeSH
- protein databases * MeSH
- eukaryotic cells cytology enzymology MeSH
- gene expression MeSH
- hydrophobic and hydrophilic interactions MeSH
- ion channels chemistry genetics metabolism MeSH
- nuclear pore chemistry genetics metabolism MeSH
- catalytic domain MeSH
- coenzymes chemistry metabolism MeSH
- humans MeSH
- mutation MeSH
- prokaryotic cells cytology enzymology MeSH
- software * MeSH
- static electricity MeSH
- animals MeSH
- Check Tag
- humans MeSH
- animals MeSH
- Publication type
- journal article MeSH
- research support (grant-funded work) MeSH
Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software development best practices promote better quality software, and better quality software improves the reproducibility and reusability of research. These recommendations are designed around Open Source values, and provide practical suggestions that contribute to making research software and its source code more discoverable, reusable and transparent. This manuscript is aimed at developers, but also at organisations, projects, journals and funders that can increase the quality and sustainability of research software by encouraging the adoption of these recommendations.
- Publication type
- journal article MeSH
Metrics for assessing adoption of good development practices are a useful way to ensure that software is sustainable, reusable and functional. Sustainability means that the software used today will be available, and continue to be improved and supported, in the future. We report here an initial set of metrics that measure good practices in software development. This initiative differs from previously developed efforts in being a community-driven grassroots approach where experts from different organisations propose good software practices that have reasonable potential to be adopted by the communities they represent. We focus our efforts not only on understanding and prioritising good practices but also on assessing their feasibility for implementation, and we publish them here.
- Publication type
- journal article MeSH
Following the discovery of serious errors in the structure of biomacromolecules, structure validation has become a key topic of research, especially for ligands and non-standard residues. ValidatorDB (freely available at http://ncbr.muni.cz/ValidatorDB) offers a new step in this direction, in the form of a database of validation results for all ligands and non-standard residues from the Protein Data Bank (all molecules with seven or more heavy atoms). Model molecules from the wwPDB Chemical Component Dictionary are used as reference during validation. ValidatorDB covers the main aspects of validation of annotation, and additionally introduces several useful validation analyses. The most significant is the classification of chirality errors, allowing the user to distinguish between serious issues and minor inconsistencies. Other such analyses are able to report, for example, completely erroneous ligands, alternate conformations or complete identity with the model molecules. All results are systematically classified into categories, and statistical evaluations are performed. In addition to detailed validation reports for each molecule, ValidatorDB provides summaries of the validation results for the entire PDB, for sets of molecules sharing the same annotation (three-letter code) or the same PDB entry, and for user-defined selections of annotations or PDB entries.
The acid dissociation constant is an important molecular property, and it can be successfully predicted by Quantitative Structure-Property Relationship (QSPR) models, even for in silico designed molecules. We analyzed how the methodology of in silico 3D structure preparation influences the quality of QSPR models. Specifically, we evaluated and compared QSPR models based on six different 3D structure sources (DTP NCI, Pubchem, Balloon, Frog2, OpenBabel, and RDKit) combined with four different types of optimization. These analyses were performed for three classes of molecules (phenols, carboxylic acids, anilines), and the QSPR model descriptors were quantum mechanical (QM) and empirical partial atomic charges. Specifically, we developed 516 QSPR models and afterward systematically analyzed the influence of the 3D structure source and other factors on their quality. Our results confirmed that QSPR models based on partial atomic charges are able to predict pKa with high accuracy. We also confirmed that ab initio and semiempirical QM charges provide very accurate QSPR models and using empirical charges based on electronegativity equalization is also acceptable, as well as advantageous, because their calculation is very fast. On the other hand, Gasteiger-Marsili empirical charges are not applicable for pKa prediction. We later found that QSPR models for some classes of molecules (carboxylic acids) are less accurate. In this context, we compared the influence of different 3D structure sources. We found that an appropriate selection of 3D structure source and optimization method is essential for the successful QSPR modeling of pKa. Specifically, the 3D structures from the DTP NCI and Pubchem databases performed the best, as they provided very accurate QSPR models for all the tested molecular classes and charge calculation approaches, and they do not require optimization. Also, Frog2 performed very well. 
Other 3D structure sources can also be used but are not so robust, and an unfortunate combination of molecular class and charge calculation approach can produce weak QSPR models. Additionally, these 3D structures generally need optimization in order to produce good quality QSPR models.
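The idea of a charge-based QSPR model can be illustrated with a minimal sketch: a least-squares line relating a single partial atomic charge to pKa. This is a simplified stand-in, not the models from the study, which are not specified here in enough detail to reproduce. The choice of one descriptor (the charge on the dissociating hydrogen), the function names and all numbers below are assumptions, and the data are synthetic.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic data: charges on the dissociating hydrogen (assumed descriptor)
# and pKa values generated from an invented linear relation.
hydrogen_charges = [0.40, 0.42, 0.45, 0.47, 0.50]
pka_values = [-30.0 * q + 22.0 for q in hydrogen_charges]

slope, intercept = fit_line(hydrogen_charges, pka_values)
predicted = [slope * q + intercept for q in hydrogen_charges]
print(slope, intercept, r_squared(pka_values, predicted))
```

In practice, a model of this kind would be fitted separately for each combination of molecular class, 3D structure source and charge calculation approach, so that the resulting fit quality can be compared across combinations.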
- MeSH
- chemical phenomena * MeSH
- quantitative structure-activity relationship * MeSH
- quantum theory MeSH
- molecular conformation * MeSH
- molecular models * MeSH
- computer simulation MeSH
- drug design MeSH
- Publication type
- journal article MeSH
- research support (grant-funded work) MeSH
- Research Support, N.I.H., Extramural MeSH
Well defined biomacromolecular patterns such as binding sites, catalytic sites, specific protein or nucleic acid sequences, etc. precisely modulate many important biological phenomena. We introduce PatternQuery, a web-based application designed for detection and fast extraction of such patterns. The application uses a unique query language with Python-like syntax to define the patterns that will be extracted from datasets provided by the user, or from the entire Protein Data Bank (PDB). Moreover, the database-wide search can be restricted using a variety of criteria, such as PDB ID, resolution, and organism of origin, to provide only relevant data. The extraction generally takes a few seconds for several hundreds of entries, up to approximately one hour for the whole PDB. The detected patterns are made available for download to enable further processing, as well as presented in a clear tabular and graphical form directly in the browser. The unique design of the language and the provided service could pave the way towards novel PDB-wide analyses, which were either difficult or unfeasible in the past. The application is available free of charge at http://ncbr.muni.cz/PatternQuery.