Most cited article - PubMed ID 31691821
PDBe: improved findability of macromolecular structure data in the PDB
The easiest and often most useful way to work with experimentally determined or computationally predicted structures of biomolecules is by viewing their three-dimensional (3D) shapes using a molecular visualization tool. Mol* was collaboratively developed by RCSB Protein Data Bank (RCSB PDB, RCSB.org) and Protein Data Bank in Europe (PDBe, PDBe.org) as an open-source, web-based, 3D visualization software suite for examination and analysis of biostructures. It is capable of displaying atomic coordinates and related experimental data of biomolecular structures together with a variety of annotations, facilitating basic and applied research, training, education, and information dissemination. Across RCSB.org, the RCSB PDB research-focused web portal, Mol* has been implemented to support single-mouse-click atomic-level visualization of biomolecules (e.g., proteins, nucleic acids, carbohydrates) with bound cofactors, small-molecule ligands, ions, water molecules, or other macromolecules. RCSB.org Mol* can seamlessly display 3D structures from various sources, allowing structure interrogation, superimposition, and comparison. Using influenza A H5N1 virus as a topical case study of an important pathogen, we exemplify how Mol* has been embedded within various RCSB.org tools, allowing users to view polymer sequence and structure-based annotations integrated from trusted bioinformatics data resources, assess patterns and trends in groups of structures, and view structures of any size and compositional complexity. In addition to being linked to every experimentally determined biostructure and Computed Structure Model made available at RCSB.org, Standalone Mol* is freely available for visualizing any atomic-level or multi-scale biostructure at rcsb.org/3d-view.
- Keywords
- 3D biostructure, Protein Data Bank, global health, influenza A H5N1 virus, molecular visualization, open‐source, pandemic preparedness, viral pathogen, virus life cycle, web‐based
- MeSH
- Databases, Protein MeSH
- Internet MeSH
- Protein Conformation MeSH
- Models, Molecular MeSH
- Proteome * chemistry MeSH
- Software * MeSH
- Viral Proteins * chemistry MeSH
- Influenza A Virus, H5N1 Subtype * chemistry MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Proteome * MeSH
- Viral Proteins * MeSH
RNA secondary (2D) structure visualization is an essential tool for understanding RNA function. R2DT is a software package designed to visualize RNA 2D structures in consistent, recognizable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms or SHAPE reactivities. It also offers a new template-free mode allowing visualization of RNAs without pre-existing templates, alongside a constrained folding mode and support for animated visualizations. Users can interactively modify R2DT diagrams, either manually or using natural language prompts, to generate new templates or create publication-quality images. Additionally, R2DT features faster performance, an expanded template library, and a growing collection of compatible tools and utilities. Already integrated into multiple biological databases, R2DT has evolved into a comprehensive platform for RNA 2D visualization, accessible at https://r2dt.bio.
- MeSH
- Polymorphism, Single Nucleotide MeSH
- Nucleic Acid Conformation * MeSH
- Computer Graphics MeSH
- RNA * chemistry genetics MeSH
- RNA Folding MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- RNA * MeSH
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics have generated genetic variants at an unprecedented scale. However, efficient tools and resources are needed to link these disparate data types: to 'map' variants onto protein structures, to better understand how the variation causes disease, and thereby to design therapeutics. Here we present the Genomics 2 Proteins portal ( https://g2p.broadinstitute.org/ ): a human proteome-wide resource that maps 20,076,998 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the Genomics 2 Proteins portal allows users to interactively upload protein residue-wise annotations (for example, variants and scores), as well as protein structures not found in databases, to establish the connection between genomics and proteins. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotypes.
- MeSH
- Databases, Protein * MeSH
- Genetic Variation MeSH
- Genetic Testing methods MeSH
- Genomics * methods MeSH
- Protein Conformation MeSH
- Humans MeSH
- Proteins genetics chemistry MeSH
- Proteome genetics MeSH
- Amino Acid Sequence MeSH
- Software MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Proteins MeSH
- Proteome MeSH
RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms (SNPs) or SHAPE reactivities. It also offers a new template-free mode allowing visualisation of RNAs without pre-existing templates, alongside a constrained folding mode and support for animated visualisations. Users can interactively modify R2DT diagrams, either manually or using natural language prompts, to generate new templates or create publication-quality images. Additionally, R2DT features faster performance, an expanded template library, and a growing collection of compatible tools and utilities. Already integrated into multiple biological databases, R2DT has evolved into a comprehensive platform for RNA 2D visualisation, accessible at https://r2dt.bio.
- Publication type
- Journal Article MeSH
- Preprint MeSH
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting structure features and functional annotations at the residue level from standard publication text. Our approach integrates data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, and uses LitSuggest and HuggingFace models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we used to train a starting PubMedBERT model from HuggingFace. Using a human-in-the-loop annotation system, we iteratively developed the best model, which achieved 0.90 for precision, 0.92 for recall, and 0.91 for F1-measure. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
- MeSH
- Databases, Protein MeSH
- Humans MeSH
- Proteins * chemistry MeSH
- Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Dataset MeSH
- Names of Substances
- Proteins * MeSH
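As a quick consistency check on the metrics quoted in the human-in-the-loop annotation abstract above, the F1-measure can be recomputed from the reported precision and recall. This is a minimal sketch using the standard harmonic-mean definition of F1; the function name `f1_score` is a hypothetical helper written for this illustration and is not taken from the cited work, which may have computed its metrics differently (for example, averaged over entity types).

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as quoted in the abstract: precision 0.90, recall 0.92.
recomputed = f1_score(0.90, 0.92)
print(round(recomputed, 2))  # 0.91, matching the reported F1-measure
```

Under this definition the three reported figures agree with each other to two decimal places.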
Channels, tunnels, and pores serve as pathways for the transport of molecules and ions through protein structures, thus contributing to their function. MOLEonline ( https://mole.upol.cz ) is an interactive web-based tool with enhanced capabilities for detecting and characterizing channels, tunnels, and pores within protein structures. MOLEonline has two distinct calculation modes: one for the analysis of channels and tunnels, and one for transmembrane pores. This application gives researchers rich analytical insights into channel detection, structural characterization, and physicochemical properties. ChannelsDB 2.0 ( https://channelsdb2.biodata.ceitec.cz/ ) is a comprehensive database that offers information on the location, geometry, and physicochemical characteristics of tunnels and pores within macromolecular structures deposited in the Protein Data Bank and AlphaFill databases. These tunnels are sourced from manual deposition from the literature and from automatic detection using the software tools MOLE and CAVER. MOLEonline and ChannelsDB visualization is powered by the LiteMol Viewer and Mol* viewer, ensuring a user-friendly workspace. This chapter provides an overview of user applications and usage.
- Keywords
- Biomacromolecule, PDB, Physicochemical properties, Pore, Protein, Residues, Tunnel, Visualization, Voronoi, mmCIF, Channel
- MeSH
- Databases, Protein * MeSH
- Web Browser MeSH
- Ion Channels metabolism chemistry MeSH
- Protein Conformation MeSH
- Models, Molecular MeSH
- Proteins chemistry metabolism MeSH
- Software * MeSH
- User-Computer Interface MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Ion Channels MeSH
- Proteins MeSH
ChannelsDB 2.0 is an updated database providing structural information about the position, geometry and physicochemical properties of protein channels (tunnels and pores) within deposited biomacromolecular structures from the PDB and AlphaFoldDB databases. The newly deposited information originates from several sources. Firstly, we included data calculated using the popular CAVER tool to complement the data obtained using the original MOLE tool for detection and analysis of protein tunnels and pores. Secondly, we added tunnels starting from cofactors within the AlphaFill database to enlarge the scope of the database to protein models based on UniProt. This has increased the number of available channel annotations ∼4.6-fold as of 1 September 2023. The database stores information about geometrical features, e.g. length and radius, and physicochemical properties based on channel-lining amino acids. The stored data are interlinked with the available UniProt mutation annotation data. ChannelsDB 2.0 provides an excellent resource for deep analysis of the role of biomacromolecular tunnels and pores. The database is available free of charge: https://channelsdb2.biodata.ceitec.cz.
- MeSH
- Amino Acids MeSH
- Databases, Protein * MeSH
- Protein Conformation MeSH
- Proteins * chemistry MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Amino Acids MeSH
- Proteins * MeSH
SUMMARY: PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerate its diverse image types, and elaborate on its user-friendly setup. This powerful tool opens a new gateway for researchers to visualize, analyse, and share their work, fostering a deeper understanding of bioinformatics. AVAILABILITY AND IMPLEMENTATION: PDBImages is available as an npm package from https://www.npmjs.com/package/pdb-images. The source code is available from https://github.com/PDBeurope/pdb-images.
- MeSH
- Molecular Structure MeSH
- Software * MeSH
- Computational Biology * methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The mitochondrial ribosome (mitoribosome) has diverged drastically from its evolutionary progenitor, the bacterial ribosome. Structural and compositional diversity is particularly striking in the phylum Euglenozoa, with an extraordinary protein gain in the mitoribosome of kinetoplastid protists. Here we report an even more complex mitoribosome in diplonemids, the sister-group of kinetoplastids. Affinity pulldown of mitoribosomal complexes from Diplonema papillatum, the diplonemid type species, demonstrates that they have a mass of >5 MDa, contain as many as 130 integral proteins, and exhibit a protein-to-RNA ratio of 11:1. This unusual composition reflects unprecedented structural reduction of ribosomal RNAs, increased size of canonical mitoribosomal proteins, and accretion of three dozen lineage-specific components. In addition, we identified >50 candidate assembly factors, around half of which contribute to early mitoribosome maturation steps. Because little is known about early assembly stages even in model organisms, our investigation of the diplonemid mitoribosome illuminates this process. Together, our results provide a foundation for understanding how runaway evolutionary divergence shapes both biogenesis and function of a complex molecular machine.
- MeSH
- Euglenozoa * classification cytology genetics MeSH
- Eukaryota cytology genetics MeSH
- Mitochondrial Ribosomes * metabolism MeSH
- Ribosomal Proteins metabolism MeSH
- RNA, Ribosomal metabolism MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Ribosomal Proteins MeSH
- RNA, Ribosomal MeSH
Segmentation helps interpret imaging data in a biological context. With the development of powerful tools for automated segmentation, public repositories for imaging data have added support for sharing and visualizing segmentations, creating the need for interactive web-based visualization of 3D volume segmentations. To address the ongoing challenge of integrating and visualizing multimodal data, we developed Mol* Volumes and Segmentations (Mol*VS), which enables the interactive, web-based visualization of cellular imaging data supported by macromolecular data and biological annotations. Mol*VS is fully integrated into Mol* Viewer, which is already used for visualization by several public repositories. All EMDB and EMPIAR entries with segmentation datasets are accessible via Mol*VS, which supports the visualization of data from a wide range of electron and light microscopy experiments. Additionally, users can run a local instance of Mol*VS to visualize and share custom datasets in generic or application-specific formats including volumes in .ccp4, .mrc, and .map, and segmentations in EMDB-SFF .hff, Amira .am, iMod .mod, and Segger .seg. Mol*VS is open source and freely available at https://molstarvolseg.ncbr.muni.cz/.
- MeSH
- Internet MeSH
- Macromolecular Substances MeSH
- Microscopy * MeSH
- Image Processing, Computer-Assisted * MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Macromolecular Substances MeSH