One of the main steps in a study of microbial communities is resolving their composition, diversity and function. In the past, these issues were mostly addressed by amplicon sequencing of a target gene because of its reasonable price and easier computational postprocessing. With the advancement of sequencing techniques, the main focus has shifted to whole metagenome shotgun sequencing, which allows much more detailed analysis of metagenomic data, including the reconstruction of novel microbial genomes and insight into the genetic potential and metabolic capacities of whole environments. On the other hand, the output of whole metagenome shotgun sequencing is a mixture of short DNA fragments belonging to various genomes; this approach therefore requires more sophisticated computational algorithms for clustering related sequences, commonly referred to as sequence binning. There are currently two types of binning methods: taxonomy dependent and taxonomy independent. The first type classifies DNA fragments by performing standard homology inference against a reference database, while the latter performs reference-free binning by applying clustering techniques to features extracted from the sequences. In this review, we describe the strategies within the second approach. Although these strategies do not require prior knowledge, they place higher demands on sequence length. Besides their basic principles, an overview of particular methods and tools is provided. Furthermore, the review covers the use of the methods in relation to sequence length and discusses the need for metagenomic data preprocessing in the form of initial assembly prior to binning.
- Keywords
- Abundance, Genomic signature, Metagenomics, Sequence binning, Taxonomy independent, Visualization
- Publication type
- Journal Article MeSH
- Review MeSH
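The taxonomy-independent binning summarized in the review abstract above (clustering on features extracted from the sequences, with "Genomic signature" among its keywords) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the review: tetranucleotide frequencies as the feature, Euclidean distance, a greedy centroid clustering with a hypothetical threshold of 0.1, and synthetic fragments drawn from two artificial base compositions. Real tools typically also merge reverse-complement k-mers and use more robust clustering.

```python
import math
import random
from itertools import product


def kmer_signature(seq, k=4):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in kmers]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bin_fragments(fragments, k=4, threshold=0.1):
    """Greedy binning: a fragment joins the nearest bin whose centroid lies
    within `threshold` (an illustrative value), otherwise it opens a new bin."""
    centroids, members = [], []
    for idx, seq in enumerate(fragments):
        sig = kmer_signature(seq, k)
        dists = [euclidean(sig, c) for c in centroids]
        if dists and min(dists) <= threshold:
            b = dists.index(min(dists))
            members[b].append(idx)
            n = len(members[b])
            # Update the running mean of the bin's signature.
            centroids[b] = [(c * (n - 1) + s) / n
                            for c, s in zip(centroids[b], sig)]
        else:
            centroids.append(sig)
            members.append([idx])
    return members


def synthetic_fragment(rng, weights, length=2000):
    """Draw a random fragment; weights are given in the order A, C, G, T."""
    return "".join(rng.choices("ACGT", weights=weights, k=length))


# Hypothetical data: three AT-rich and three GC-rich fragments.
rng = random.Random(0)
at_rich = [synthetic_fragment(rng, [0.4, 0.1, 0.1, 0.4]) for _ in range(3)]
gc_rich = [synthetic_fragment(rng, [0.1, 0.4, 0.4, 0.1]) for _ in range(3)]
bins = bin_fragments(at_rich + gc_rich)
print(bins)
```

The sketch uses 2000-base fragments because short fragments give noisy k-mer frequencies, which is consistent with the abstract's remark that reference-free strategies place higher demands on sequence length.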
SUMMARY: PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerating its diverse image types, and elaborating on its user-friendly setup. This powerful tool opens a new gateway for researchers to visualize, analyse, and share their work, fostering a deeper understanding of bioinformatics. AVAILABILITY AND IMPLEMENTATION: PDBImages is available as an npm package from https://www.npmjs.com/package/pdb-images. The source code is available from https://github.com/PDBeurope/pdb-images.
- MeSH
- Molecular Structure MeSH
- Software * MeSH
- Computational Biology * methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
SUMMARY: MolArt fills the gap between sequence and structure visualization by providing a lightweight, interactive environment for exploring sequence annotations in the context of available experimental or predicted protein structures. Given a UniProt ID, MolArt downloads and displays sequence annotations, sequence-structure mapping and relevant structures. The sequence and structure views are interlinked, so sequence annotations can be overlaid in color on the mapped structures, providing an enhanced understanding and interpretation of the available molecular data. AVAILABILITY AND IMPLEMENTATION: MolArt is released under the Apache 2 license and is available at https://github.com/davidhoksza/MolArt. The project web page https://davidhoksza.github.io/MolArt/ features examples and applications of the tool.
- MeSH
- Color MeSH
- Protein Conformation * MeSH
- Molecular Structure * MeSH
- Proteins * MeSH
- Software * MeSH
- Computational Biology MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Proteins * MeSH
Visualization analysis plays an important role in metagenomics research. Proper and clear visualization can help researchers gain first insights into their data and, by selecting different features, reveal and highlight hidden relationships and draw conclusions. To prevent the resulting presentations from becoming chaotic, visualization techniques have to properly tackle the high dimensionality of microbiome data. Although a number of different methods based on dimensionality reduction, correlations, Venn diagrams, and network representations have already been published, there is still room for further improvement, especially in techniques that allow visual comparison of several environments or developmental stages in one environment. In this article, we represent microbiome data by bipartite graphs, where one partition stands for taxa and the other stands for samples. We demonstrate that community detection is independent of taxonomical level. Moreover, focusing on higher taxonomical levels and appropriately merging samples greatly helps to improve graph organization and makes our presentations clearer than other graph and network visualizations. Capturing labels in the vertices also makes it possible to clearly compare two or more microbial communities by showing their common and unique parts.
- Keywords
- 16S rRNA, OTU table, bipartite graph, graph modularity, metagenomics, visualization analysis
- Publication type
- Journal Article MeSH
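The bipartite representation described in the abstract above (one partition for taxa, one for samples, with "graph modularity" among the keywords) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the OTU table, the sample names, the genus names, the use of read counts as edge weights, and the hand-picked partition are all hypothetical. The modularity function is the standard weighted Newman formulation, and the abstract does not say which community detection algorithm the article uses.

```python
from collections import defaultdict

# Hypothetical OTU table: sample -> {taxon: read count}.
otu_table = {
    "gut_A": {"Bacteroides": 120, "Faecalibacterium": 80, "Escherichia": 5},
    "gut_B": {"Bacteroides": 95, "Faecalibacterium": 60},
    "soil_A": {"Streptomyces": 70, "Bacillus": 40, "Escherichia": 3},
}


def build_bipartite(table):
    """Return weighted edges {(sample_node, taxon_node): count}.
    Nodes are tagged tuples so sample and taxon names cannot collide."""
    edges = {}
    for sample, counts in table.items():
        for taxon, count in counts.items():
            if count > 0:
                edges[(("sample", sample), ("taxon", taxon))] = count
    return edges


def compare_samples(edges, a, b):
    """Return (common, unique to a, unique to b) taxa for two samples."""
    taxa = defaultdict(set)
    for (sample_node, taxon_node) in edges:
        taxa[sample_node[1]].add(taxon_node[1])
    return taxa[a] & taxa[b], taxa[a] - taxa[b], taxa[b] - taxa[a]


def modularity(edges, community_of):
    """Weighted modularity: sum over communities c of
    (weight inside c) / m - (total degree of c / 2m)^2."""
    m = sum(edges.values())
    inside = defaultdict(float)
    degree = defaultdict(float)
    for (u, v), w in edges.items():
        cu, cv = community_of[u], community_of[v]
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            inside[cu] += w
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


edges = build_bipartite(otu_table)
common, only_gut, only_soil = compare_samples(edges, "gut_A", "soil_A")

# A hand-made partition: gut samples with their taxa versus the soil sample.
nodes = {n for edge in edges for n in edge}
soil_side = {("sample", "soil_A"), ("taxon", "Streptomyces"), ("taxon", "Bacillus")}
two_groups = {n: ("soil" if n in soil_side else "gut") for n in nodes}
one_group = {n: "all" for n in nodes}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(sorted(common), sorted(only_gut), sorted(only_soil))
print(q_two > q_one)
```

Placing every node in a single community gives a modularity of zero by construction, so any partition that scores above zero groups samples and taxa more tightly than chance would.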
BACKGROUND: Protein function is determined by many factors, namely its constitution, spatial arrangement, and dynamic behavior. Studying these factors helps biochemists and biologists better understand protein behavior and design proteins with modified properties. One of the most common approaches in these studies is to compare the protein structure with other molecules and to reveal similarities and differences in their polypeptide chains. RESULTS: We support the comparison process by proposing a new visualization technique that bridges the gap between the traditionally used 1D and 3D representations. By introducing information about the mutual positions of protein chains into the 1D sequential representation, we enable users to observe the spatial differences between proteins without the occlusion commonly present in 3D views. Our representation is designed primarily for comparing multiple proteins or a set of time steps of a molecular dynamics simulation. CONCLUSIONS: The novel representation is demonstrated on two usage scenarios. The first scenario compares a set of proteins from the cytochrome P450 family, where the position of the secondary structures has a significant impact on substrate channeling. The second scenario focuses on protein flexibility: by comparing a set of time steps, our representation helps reveal the most dynamically changing parts of the protein chain.
- Keywords
- Molecular sequence analysis, Molecular structure and function, Molecular visualization
- MeSH
- Algorithms MeSH
- Models, Molecular MeSH
- Proteins chemistry MeSH
- Protein Structure, Secondary * MeSH
- Amino Acid Sequence MeSH
- Sequence Alignment MeSH
- Molecular Dynamics Simulation * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Proteins MeSH
BACKGROUND: Visualization of RNA secondary structures is a complex task, and, especially for large RNA structures where the expected layout is largely dictated by convention, existing visualization tools often fail to produce suitable visualizations. This led us to use existing layouts as templates for the visualization of new RNAs, similarly to how templates are used in homology-based structure prediction. RESULTS: This article introduces Traveler, a software tool enabling visualization of a target RNA secondary structure using an existing layout of a sufficiently similar RNA structure as a template. Traveler is based on an algorithm which converts the target and template structures into corresponding tree representations and utilizes tree edit distance coupled with layout modification operations to transform the template layout into the target one. Traveler thus accepts a pair of secondary structures and a template layout and outputs a layout for the target structure. CONCLUSIONS: Traveler is a command-line open source tool able to quickly generate layouts for even the largest RNA structures in the presence of a sufficiently similar layout. It is available at http://github.com/davidhoksza/traveler.
- Keywords
- RNA secondary structure, Software tool, Template-based modeling, Visualization
- MeSH
- Algorithms MeSH
- Nucleic Acid Conformation MeSH
- RNA chemistry MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- RNA MeSH
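The first step named in the Traveler abstract above, converting a secondary structure into a tree representation, can be sketched as follows. This is only an illustration under assumptions: the dot-bracket input notation, the `Node` class, and the choice of one tree node per base pair and per unpaired base are hypothetical and may differ from the encoding Traveler actually uses. The tree edit distance and the layout modification operations are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str   # "root", "pair" (a base pair) or "unpaired" (a single base)
    span: tuple  # (i, j) positions of a base pair, (i, i) for an unpaired base
    children: list = field(default_factory=list)


def dot_bracket_to_tree(structure):
    """Parse a pseudoknot-free dot-bracket string into an ordered tree."""
    root = Node("root", (-1, len(structure)))
    stack = [root]
    for i, ch in enumerate(structure):
        if ch == "(":
            # Open a base pair; its partner position is filled in on ")".
            node = Node("pair", (i, None))
            stack[-1].children.append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' at position {i}")
            node = stack.pop()
            node.span = (node.span[0], i)
        elif ch == ".":
            stack[-1].children.append(Node("unpaired", (i, i)))
        else:
            raise ValueError(f"unexpected symbol {ch!r} at position {i}")
    if len(stack) != 1:
        raise ValueError("unmatched '(' in structure")
    return root


def size(node):
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


# A small hairpin with one trailing unpaired base.
tree = dot_bracket_to_tree("((..)).")
print([child.label for child in tree.children], size(tree))
```

With both the target and the template in this form, an ordered tree edit distance could align matching nodes, which is the role the abstract assigns to tree edit distance.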
Channels, tunnels, and pores serve as pathways for the transport of molecules and ions through protein structures, thus contributing to their functions. MOLEonline (https://mole.upol.cz) is an interactive web-based tool with enhanced capabilities for detecting and characterizing channels, tunnels, and pores within protein structures. MOLEonline has two distinct calculation modes: one for the analysis of channels and tunnels, and one for transmembrane pores. This application gives researchers rich analytical insights into channel detection, structural characterization, and physicochemical properties. ChannelsDB 2.0 (https://channelsdb2.biodata.ceitec.cz/) is a comprehensive database that offers information on the location, geometry, and physicochemical characteristics of tunnels and pores within macromolecular structures deposited in the Protein Data Bank and AlphaFill databases. These tunnels come from manual deposition based on the literature and from automatic detection using the software tools MOLE and CAVER. MOLEonline and ChannelsDB visualization is powered by the LiteMol Viewer and Mol* viewer, ensuring a user-friendly workspace. This chapter provides an overview of user applications and usage.
- Keywords
- Biomacromolecule, PDB, Physicochemical properties, Pore, Protein, Residues, Tunnel, Visualization, Voronoi, mmCIF, Channel
- MeSH
- Databases, Protein * MeSH
- Web Browser MeSH
- Ion Channels metabolism chemistry MeSH
- Protein Conformation MeSH
- Models, Molecular MeSH
- Proteins chemistry metabolism MeSH
- Software * MeSH
- User-Computer Interface MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Ion Channels MeSH
- Proteins MeSH
The transport of ligands, ions or solvent molecules into proteins with buried binding sites or through the membrane is enabled by protein tunnels and channels. CAVER Analyst is a software tool for calculation, analysis and real-time visualization of access tunnels and channels in static and dynamic protein structures. It provides an intuitive graphical user interface for setting up the calculation and for interactive exploration of identified tunnels/channels and their characteristics. AVAILABILITY AND IMPLEMENTATION: CAVER Analyst is multi-platform software written in Java. Binaries and documentation are freely available for non-commercial use at http://www.caver.cz.
- MeSH
- Ligands MeSH
- Computer Graphics * MeSH
- Proteins chemistry metabolism MeSH
- Software * MeSH
- User-Computer Interface MeSH
- Binding Sites MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Ligands MeSH
- Proteins MeSH
Protein structure determines biological function. Accurately conceptualizing 3D protein/ligand structures is thus vital to scientific research and education. Virtual reality (VR) enables protein visualization in stereoscopic 3D, but many VR molecular-visualization programs are expensive and challenging to use; work only on specific VR headsets; rely on complicated model-preparation software; and/or require the user to install separate programs or plugins. Here we introduce ProteinVR, a web-based application that works on various VR setups and operating systems. ProteinVR displays molecular structures within 3D environments that give useful biological context and allow users to situate themselves in 3D space. Our web-based implementation is ideal for hypothesis generation and education in research and large-classroom settings. We release ProteinVR under the open-source BSD-3-Clause license. A copy of the program is available free of charge from http://durrantlab.com/protein-vr/, and a working version can be accessed at http://durrantlab.com/pvr/.
- MeSH
- Internet * MeSH
- Protein Conformation MeSH
- Proteins * chemistry ultrastructure MeSH
- Virtual Reality * MeSH
- Computational Biology methods MeSH
- Imaging, Three-Dimensional methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Names of Substances
- Proteins * MeSH
With the advent of OMICs technologies, both individual research groups and consortia have spearheaded the characterization of human samples of multiple pathophysiologic origins, resulting in thousands of archived genomes and transcriptomes. Although a variety of web tools are now available to extract information from OMICs data, their utility has been limited by the capacity of nonbioinformatician researchers to exploit the information. To address this problem, we have developed CANCERTOOL, a web-based interface that aims to overcome the major limitations of public transcriptomics dataset analysis for highly prevalent types of cancer (breast, prostate, lung, and colorectal). CANCERTOOL provides rapid and comprehensive visualization of gene expression data for the gene(s) of interest in well-annotated cancer datasets. This visualization is accompanied by generation of reports customized to the interest of the researcher (e.g., editable figures, detailed statistical analyses, and access to raw data for reanalysis). It also carries out gene-to-gene correlations in multiple datasets at the same time or using preset patient groups. Finally, this new tool solves the time-consuming task of performing functional enrichment analysis with gene sets of interest using up to 11 different databases at the same time. Collectively, CANCERTOOL represents a simple and freely accessible interface to interrogate well-annotated datasets and obtain publishable representations that can contribute to refinement and guidance of cancer-related investigations at all levels of hypotheses and design. Significance: In order to facilitate access of research groups without bioinformatics support to public transcriptomics data, we have developed a free online tool with an easy-to-use interface that allows researchers to obtain quality information in a readily publishable format. Cancer Res; 78(21); 6320-8. ©2018 AACR.
- MeSH
- Algorithms MeSH
- Databases, Factual MeSH
- Databases, Genetic MeSH
- Genomics MeSH
- Internet MeSH
- Medical Oncology MeSH
- Humans MeSH
- Neoplasms genetics MeSH
- Computer Graphics MeSH
- Proteomics MeSH
- Workflow MeSH
- Software MeSH
- Transcriptome MeSH
- User-Computer Interface MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH