Nejvíce citovaný článek - PubMed ID 30445541
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics have generated genetic variants at an unprecedented scale. However, efficient tools and resources are needed to link disparate data types-to 'map' variants onto protein structures, to better understand how the variation causes disease, and thereby design therapeutics. Here we present the Genomics 2 Proteins portal ( https://g2p.broadinstitute.org/ ): a human proteome-wide resource that maps 20,076,998 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the Genomics 2 Proteins portal allows users to interactively upload protein residue-wise annotations (for example, variants and scores) as well as the protein structure beyond databases to establish the connection between genomics to proteins. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotypes.
- MeSH
- databáze proteinů * MeSH
- genetická variace MeSH
- genetické testování metody MeSH
- genomika * metody MeSH
- konformace proteinů MeSH
- lidé MeSH
- proteiny genetika chemie MeSH
- proteom genetika MeSH
- sekvence aminokyselin MeSH
- software MeSH
- Check Tag
- lidé MeSH
- Publikační typ
- časopisecké články MeSH
- Názvy látek
- proteiny MeSH
- proteom MeSH
SUMMARY: Protein design requires information about how mutations affect protein stability. Many web-based predictors are available for this purpose, yet comparing them or using them en masse is difficult. Here, we present BenchStab, a console tool/Python package for easy and quick execution of 19 predictors and result collection on a list of mutants. Moreover, the tool is easily extensible with additional predictors. We created an independent dataset derived from the FireProtDB and evaluated 24 different prediction methods. AVAILABILITY AND IMPLEMENTATION: BenchStab is an open-source Python package available at https://github.com/loschmidt/BenchStab with a detailed README and example usage at https://loschmidt.chemi.muni.cz/benchstab. The BenchStab dataset is available on Zenodo: https://zenodo.org/records/10637728.
- MeSH
- databáze proteinů MeSH
- internet * MeSH
- proteiny chemie MeSH
- software * MeSH
- stabilita proteinů MeSH
- výpočetní biologie metody MeSH
- Publikační typ
- časopisecké články MeSH
- Názvy látek
- proteiny MeSH
Every year, more than 19 million cancer cases are diagnosed, and this number continues to increase annually. Since standard treatment options have varying success rates for different types of cancer, understanding the biology of an individual's tumour becomes crucial, especially for cases that are difficult to treat. Personalised high-throughput profiling, using next-generation sequencing, allows for a comprehensive examination of biopsy specimens. Furthermore, the widespread use of this technology has generated a wealth of information on cancer-specific gene alterations. However, there exists a significant gap between identified alterations and their proven impact on protein function. Here, we present a bioinformatics pipeline that enables fast analysis of a missense mutation's effect on stability and function in known oncogenic proteins. This pipeline is coupled with a predictor that summarises the outputs of different tools used throughout the pipeline, providing a single probability score, achieving a balanced accuracy above 86%. The pipeline incorporates a virtual screening method to suggest potential FDA/EMA-approved drugs to be considered for treatment. We showcase three case studies to demonstrate the timely utility of this pipeline. To facilitate access and analysis of cancer-related mutations, we have packaged the pipeline as a web server, which is freely available at https://loschmidt.chemi.muni.cz/predictonco/ .Scientific contributionThis work presents a novel bioinformatics pipeline that integrates multiple computational tools to predict the effects of missense mutations on proteins of oncological interest. The pipeline uniquely combines fast protein modelling, stability prediction, and evolutionary analysis with virtual drug screening, while offering actionable insights for precision oncology. This comprehensive approach surpasses existing tools by automating the interpretation of mutations and suggesting potential treatments, thereby striving to bridge the gap between sequencing data and clinical application.
- Klíčová slova
- Bioinformatics, Cancer, Function, High-performance computing, Machine learning, Molecular modelling, Oncology, Personalised medicine, Single nucleotide polymorphism, Stability, Treatment,
- Publikační typ
- časopisecké články MeSH
SUMMARY: PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerating its diverse image types, and elaborating on its user-friendly setup. This powerful tool opens a new gateway for researchers to visualize, analyse, and share their work, fostering a deeper understanding of bioinformatics. AVAILABILITY AND IMPLEMENTATION: PDBImages is available as an npm package from https://www.npmjs.com/package/pdb-images. The source code is available from https://github.com/PDBeurope/pdb-images.
- MeSH
- molekulární struktura MeSH
- software * MeSH
- výpočetní biologie * metody MeSH
- Publikační typ
- časopisecké články MeSH
- práce podpořená grantem MeSH
Macromolecular complexes are essential functional units in nearly all cellular processes, and their atomic-level understanding is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, the current annotation practices lack standardised naming conventions for assemblies in the PDB, complicating the identification of instances representing the same assembly. In this study, we introduce a method leveraging resources external to PDB, such as the Complex Portal, UniProt and Gene Ontology, to describe assemblies and contextualise them within their biological settings accurately. Employing the proposed approach, we assigned standard names to over 90% of unique assemblies in the PDB and provided persistent identifiers for each assembly. This standardisation of assembly data enhances the PDB, facilitating a deeper understanding of macromolecular complexes. Furthermore, the data standardisation improves the PDB's FAIR attributes, fostering more effective basic and translational research and scientific education.
PredictONCO 1.0 is a unique web server that analyzes effects of mutations on proteins frequently altered in various cancer types. The server can assess the impact of mutations on the protein sequential and structural properties and apply a virtual screening to identify potential inhibitors that could be used as a highly individualized therapeutic approach, possibly based on the drug repurposing. PredictONCO integrates predictive algorithms and state-of-the-art computational tools combined with information from established databases. The user interface was carefully designed for the target specialists in precision oncology, molecular pathology, clinical genetics and clinical sciences. The tool summarizes the effect of the mutation on protein stability and function and currently covers 44 common oncological targets. The binding affinities of Food and Drug Administration/ European Medicines Agency -approved drugs with the wild-type and mutant proteins are calculated to facilitate treatment decisions. The reliability of predictions was confirmed against 108 clinically validated mutations. The server provides a fast and compact output, ideal for the often time-sensitive decision-making process in oncology. Three use cases of missense mutations, (i) K22A in cyclin-dependent kinase 4 identified in melanoma, (ii) E1197K mutation in anaplastic lymphoma kinase 4 identified in lung carcinoma and (iii) V765A mutation in epidermal growth factor receptor in a patient with congenital mismatch repair deficiency highlight how the tool can increase levels of confidence regarding the pathogenicity of the variants and identify the most effective inhibitors. The server is available at https://loschmidt.chemi.muni.cz/predictonco.
- Klíčová slova
- cancer, oncology, personalized medicine, single-nucleotide polymorphism, targeted therapy,
- MeSH
- individualizovaná medicína * MeSH
- lidé MeSH
- melanom * MeSH
- mutace MeSH
- proteiny MeSH
- reprodukovatelnost výsledků MeSH
- strojové učení MeSH
- výpočetní biologie MeSH
- Check Tag
- lidé MeSH
- Publikační typ
- časopisecké články MeSH
- práce podpořená grantem MeSH
- Názvy látek
- proteiny MeSH
SUMMARY: Understanding the mechanism of action of a protein or designing better ligands for it, often requires access to a bound (holo) and an unbound (apo) state of the protein. Resources for the quick and easy retrieval of such conformations are severely limited. Apo-Holo Juxtaposition (AHoJ), is a web application for retrieving apo-holo structure pairs for user-defined ligands. Given a query structure and one or more user-specified ligands, it retrieves all other structures of the same protein that feature the same binding site(s), aligns them, and examines the superimposed binding sites to determine whether each structure is apo or holo, in reference to the query. The resulting superimposed datasets of apo-holo pairs can be visualized and downloaded for further analysis. AHoJ accepts multiple input queries, allowing the creation of customized apo-holo datasets. AVAILABILITY AND IMPLEMENTATION: Freely available for non-commercial use at http://apoholo.cz. Source code available at https://github.com/cusbg/AHoJ-project. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- MeSH
- konformace proteinů MeSH
- ligandy MeSH
- proteiny * chemie MeSH
- software * MeSH
- vazebná místa MeSH
- Publikační typ
- časopisecké články MeSH
- práce podpořená grantem MeSH
- Názvy látek
- ligandy MeSH
- proteiny * MeSH
The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.
- Klíčová slova
- bioinformatics, databases, protein data bank, structural biology,
- MeSH
- databáze proteinů MeSH
- konformace proteinů MeSH
- nukleové kyseliny * MeSH
- proteiny * chemie MeSH
- Publikační typ
- časopisecké články MeSH
- práce podpořená grantem MeSH
- Geografické názvy
- Evropa MeSH
- Názvy látek
- nukleové kyseliny * MeSH
- proteiny * MeSH
In this review, we describe the creation of the Nucleic Acid Database (NDB) at Rutgers University and how it became a testbed for the current infrastructure of the RCSB Protein Data Bank. We describe some of the special features of the NDB and how it has been used to enable research. Plans for the next phase as the Nucleic Acid Knowledgebase (NAKB) are summarized.
- Klíčová slova
- DNA, RNA, biological structure database, nucleic acid conformation, nucleic acid structures, validation standards,
- Publikační typ
- časopisecké články MeSH
- přehledy MeSH
This article examines the presence of the empirical tendency known as the Menzerath-Altmann Law (MAL) on protein secondary structures. MAL is related to optimization principles observed in natural languages and in genetic information on chromosomes or protein domains. The presence of MAL is examined on a non-redundant dataset of 4728 proteins by verifying significant, negative correlations and testing classical and newly proposed formulas by fitting the observed trend. We conclude that the lengths of secondary structures are specifically dependent on their number inside the protein sequence, while possibly reflecting the formula proposed in this paper. This behavior is observed on average but is individually avoidable and possibly driven by a latent cost function. The data suggest that MAL could provide a useful guiding principle in protein design.
- Klíčová slova
- Menzerath–Altmann law, empirical law, formula fitting, proteins, quantitative linguistics, secondary structures,
- MeSH
- algoritmy MeSH
- databáze proteinů MeSH
- molekulární modely * MeSH
- proteiny chemie MeSH
- sekundární struktura proteinů MeSH
- statistika jako téma MeSH
- subcelulární frakce metabolismus MeSH
- Publikační typ
- časopisecké články MeSH
- Názvy látek
- proteiny MeSH