Most cited: 30109435

8 citations in PubMed

Most cited article - PubMed ID 30109435

P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure

Journal of cheminformatics. 2018 Aug 14 ; 10 (1) : 39. [epub] 20180814

J Cheminform
ISSN 1758-2946

Article

CryptoBench: cryptic protein-ligand binding sites dataset and benchmark

Škrhák, Vít: Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, 118 00 Prague, Czech Republic
Novotný, Marian: Department of Cell Biology, Faculty of Science, Charles University, 128 43 Prague, Czech Republic
Feidakis, Christos P: Department of Cell Biology, Faculty of Science, Charles University, 128 43 Prague, Czech Republic
Krivák, Radoslav: Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, 118 00 Prague, Czech Republic; Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, 160 00 Prague, Czech Republic
Hoksza, David: Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, 118 00 Prague, Czech Republic

Bioinformatics (Oxford, England). 2024 Dec 26 ; 41 (1) : .

Bioinformatics
ISSN 1367-4811 | 1367-4803

MOTIVATION: Structure-based methods for detecting protein-ligand binding sites play a crucial role in various domains, from fundamental research to biomedical applications. However, current prediction methodologies often rely on holo (ligand-bound) protein conformations for training and evaluation, overlooking the significance of the apo (ligand-free) states. This oversight is particularly problematic in the case of cryptic binding sites (CBSs), where holo-based assessment yields unrealistic performance expectations.

RESULTS: To advance the development in this domain, we introduce CryptoBench, a benchmark dataset tailored for training and evaluating novel CBS prediction methodologies. CryptoBench is constructed upon a large collection of apo-holo protein pairs, grouped by UniProt ID, clustered by sequence identity, and filtered to contain only structures with substantial structural change in the binding site. CryptoBench comprises 1107 structures with predefined cross-validation splits, making it the most extensive CBS dataset to date. To establish a performance baseline, we measured the predictive power of sequence- and structure-based CBS residue prediction methods using the benchmark. We selected PocketMiner as the state-of-the-art representative of the structure-based methods for CBS detection, and P2Rank, a widely used structure-based method for general binding site prediction that is not specifically tailored for cryptic sites. For sequence-based approaches, we trained a neural network to classify binding residues using protein language model embeddings. Our sequence-based approach outperformed PocketMiner and P2Rank across key metrics, including area under the curve, area under the precision-recall curve, Matthews correlation coefficient, and F1 scores. These results provide baseline benchmark results for future CBS and potentially also non-CBS prediction endeavors, leveraging CryptoBench as the foundational platform for further advancements in the field.

AVAILABILITY AND IMPLEMENTATION: The CryptoBench dataset, including the benchmark model, is available on Open Science Framework: https://osf.io/pz4a9/. The code and tutorial are available at the GitHub repository: https://github.com/skrhakv/CryptoBench/.
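
Two of the metrics named in the abstract above, the Matthews correlation coefficient and the F1 score, can be computed directly from per-residue confusion counts. The following is a minimal sketch using the standard textbook definitions; it is not taken from the CryptoBench code, and the function names and the toy labels are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary per-residue labels (1 = binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (made-up labels for eight residues, not benchmark data).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"F1 = {f1_score(tp, fp, fn):.3f}, MCC = {mcc(tp, tn, fp, fn):.3f}")
```

MCC uses all four confusion counts, which makes it informative when binding residues are a small minority of all residues, as is typical for binding site prediction.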

Article

Analysis of mutations in precision oncology using the automated, accurate, and user-friendly web tool PredictONCO

Computational and structural biotechnology journal. 2024 Dec ; 24 : 734-738. [epub] 20241114

Comput Struct Biotechnol J
ISSN 2001-0370

Next-generation sequencing technology has created many new opportunities for clinical diagnostics, but it faces the challenge of functional annotation of identified mutations. Various algorithms have been developed to predict the impact of missense variants that influence oncogenic drivers. However, computational pipelines that handle biological data must integrate multiple software tools, which can add complexity and hinder non-specialist users from accessing the pipeline. Here, we have developed an online user-friendly web server tool PredictONCO that is fully automated and has a low barrier to access. The tool models the structure of the mutant protein in the first step. Next, it calculates the protein stability change, pocket level information, evolutionary conservation, and changes in ionisation of catalytic amino acid residues, and uses them as the features in the machine-learning predictor. The XGBoost-based predictor was validated on an independent subset of held-out data, demonstrating areas under the receiver operating characteristic curve (ROC) of 0.97 and 0.94, and the average precision from the precision-recall curve of 0.99 and 0.94 for structure-based and sequence-based predictions, respectively. Finally, PredictONCO calculates the docking results of small molecules approved by regulatory authorities. We demonstrate the applicability of the tool by presenting its usage for variants in two cancer-associated proteins, cellular tumour antigen p53 and fibroblast growth factor receptor FGFR1. Our free web tool will assist with the interpretation of data from next-generation sequencing and navigate treatment strategies in clinical oncology: https://loschmidt.chemi.muni.cz/predictonco/.

Keywords
Automation, Machine learning, Mutation, Next-generation sequencing, Oncogenicity, Precision oncology, Prediction, Treatment, Virtual screening, Webserver
Publication type
Journal Article

Article

Large-scale annotation of biochemically relevant pockets and tunnels in cognate enzyme-ligand complexes

Journal of cheminformatics. 2024 Oct 15 ; 16 (1) : 114. [epub] 20241015

J Cheminform
ISSN 1758-2946

Tunnels in enzymes with buried active sites are key structural features allowing the entry of substrates and the release of products, thus contributing to the catalytic efficiency. Targeting the bottlenecks of protein tunnels is also a powerful protein engineering strategy. However, the identification of functional tunnels in multiple protein structures is a non-trivial task that can only be addressed computationally. We present a pipeline integrating automated structural analysis with an in-house machine-learning predictor for the annotation of protein pockets, followed by the calculation of the energetics of ligand transport via biochemically relevant tunnels. A thorough validation using eight distinct molecular systems revealed that CaverDock analysis of ligand un/binding is on par with time-consuming molecular dynamics simulations, but much faster. The optimized and validated pipeline was applied to annotate more than 17,000 cognate enzyme-ligand complexes. Analysis of ligand un/binding energetics indicates that the top priority tunnel has the most favourable energies in 75% of cases. Moreover, energy profiles of cognate ligands revealed that a simple geometry analysis can correctly identify tunnel bottlenecks only in 50% of cases. Our study provides essential information for the interpretation of results from tunnel calculation and energy profiling in mechanistic enzymology and protein engineering. We formulated several simple rules allowing identification of biochemically relevant tunnels based on the binding pockets, tunnel geometry, and ligand transport energy profiles.

Scientific contributions

The pipeline introduced in this work allows for the detailed analysis of a large set of protein-ligand complexes, focusing on transport pathways. We are introducing a novel predictor for determining the relevance of binding pockets for tunnel calculation. For the first time in the field, we present a high-throughput energetic analysis of ligand binding and unbinding, showing that approximate methods for these simulations can identify additional mutagenesis hotspots in enzymes compared to purely geometrical methods. The predictor is included in the supplementary material and can also be accessed at https://github.com/Faranehhad/Large-Scale-Pocket-Tunnel-Annotation.git. The tunnel data calculated in this study has been made publicly available as part of the ChannelsDB 2.0 database, accessible at https://channelsdb2.biodata.ceitec.cz/.

Keywords
Bottleneck, Cavity, Cognate ligand, Enzyme, Machine learning, Pocket, Transport, Tunnel
Publication type
Journal Article

Article

A computational workflow for analysis of missense mutations in precision oncology

Journal of cheminformatics. 2024 Jul 29 ; 16 (1) : 86. [epub] 20240729

J Cheminform
ISSN 1758-2946

Every year, more than 19 million cancer cases are diagnosed, and this number continues to increase annually. Since standard treatment options have varying success rates for different types of cancer, understanding the biology of an individual's tumour becomes crucial, especially for cases that are difficult to treat. Personalised high-throughput profiling, using next-generation sequencing, allows for a comprehensive examination of biopsy specimens. Furthermore, the widespread use of this technology has generated a wealth of information on cancer-specific gene alterations. However, there exists a significant gap between identified alterations and their proven impact on protein function. Here, we present a bioinformatics pipeline that enables fast analysis of a missense mutation's effect on stability and function in known oncogenic proteins. This pipeline is coupled with a predictor that summarises the outputs of different tools used throughout the pipeline, providing a single probability score, achieving a balanced accuracy above 86%. The pipeline incorporates a virtual screening method to suggest potential FDA/EMA-approved drugs to be considered for treatment. We showcase three case studies to demonstrate the timely utility of this pipeline. To facilitate access and analysis of cancer-related mutations, we have packaged the pipeline as a web server, which is freely available at https://loschmidt.chemi.muni.cz/predictonco/.

Scientific contribution

This work presents a novel bioinformatics pipeline that integrates multiple computational tools to predict the effects of missense mutations on proteins of oncological interest. The pipeline uniquely combines fast protein modelling, stability prediction, and evolutionary analysis with virtual drug screening, while offering actionable insights for precision oncology. This comprehensive approach surpasses existing tools by automating the interpretation of mutations and suggesting potential treatments, thereby striving to bridge the gap between sequencing data and clinical application.

Keywords
Bioinformatics, Cancer, Function, High-performance computing, Machine learning, Molecular modelling, Oncology, Personalised medicine, Single nucleotide polymorphism, Stability, Treatment
Publication type
Journal Article

Article

Installation of LYRM proteins in early eukaryotes to regulate the metabolic capacity of the emerging mitochondrion

Open biology. 2024 May ; 14 (5) : 240021. [epub] 20240522

Open Biol
ISSN 2046-2441

Core mitochondrial processes such as the electron transport chain, protein translation and the formation of Fe-S clusters (ISC) are of prokaryotic origin and were present in the bacterial ancestor of mitochondria. In animal and fungal models, a family of small Leu-Tyr-Arg motif-containing proteins (LYRMs) uniformly regulates the function of mitochondrial complexes involved in these processes. The action of LYRMs is contingent upon their binding to the acylated form of acyl carrier protein (ACP). This study demonstrates that LYRMs are structurally and evolutionarily related proteins characterized by a core triplet of α-helices. Their widespread distribution across eukaryotes suggests that 12 specialized LYRMs were likely present in the last eukaryotic common ancestor to regulate the assembly and folding of the subunits that are conserved in bacteria but that lack LYRM homologues. The secondary reduction of mitochondria to anoxic environments has rendered the function of LYRMs and their interaction with acylated ACP dispensable. Consequently, these findings strongly suggest that early eukaryotes installed LYRMs in aerobic mitochondria as orchestrated switches, essential for regulating core metabolism and ATP production.

Keywords
LECA, LYRM proteins, acyl-ACP, mitochondrial evolution
MeSH
Eukaryota metabolism
Phylogeny
Humans
Mitochondrial Proteins * metabolism genetics
Mitochondria * metabolism
Evolution, Molecular
Models, Molecular
Acyl Carrier Protein metabolism genetics
Amino Acid Sequence
Animals
Check Tag
Humans
Animals
Publication type
Journal Article

Article

PredictONCO: a web tool supporting decision-making in precision oncology by extending the bioinformatics predictions with advanced computing and machine learning

Briefings in bioinformatics. 2023 Nov 22 ; 25 (1) : .

Brief Bioinform
ISSN 1477-4054 | 1467-5463

PredictONCO 1.0 is a unique web server that analyzes the effects of mutations on proteins frequently altered in various cancer types. The server can assess the impact of mutations on the sequence and structural properties of the protein and apply virtual screening to identify potential inhibitors that could be used as a highly individualized therapeutic approach, possibly based on drug repurposing. PredictONCO integrates predictive algorithms and state-of-the-art computational tools combined with information from established databases. The user interface was carefully designed for the target specialists in precision oncology, molecular pathology, clinical genetics and clinical sciences. The tool summarizes the effect of the mutation on protein stability and function and currently covers 44 common oncological targets. The binding affinities of Food and Drug Administration/European Medicines Agency-approved drugs with the wild-type and mutant proteins are calculated to facilitate treatment decisions. The reliability of predictions was confirmed against 108 clinically validated mutations. The server provides a fast and compact output, ideal for the often time-sensitive decision-making process in oncology. Three use cases of missense mutations, (i) K22A in cyclin-dependent kinase 4 identified in melanoma, (ii) E1197K mutation in anaplastic lymphoma kinase 4 identified in lung carcinoma and (iii) V765A mutation in epidermal growth factor receptor in a patient with congenital mismatch repair deficiency, highlight how the tool can increase levels of confidence regarding the pathogenicity of the variants and identify the most effective inhibitors. The server is available at https://loschmidt.chemi.muni.cz/predictonco.

Keywords
cancer, oncology, personalized medicine, single-nucleotide polymorphism, targeted therapy
MeSH
Precision Medicine *
Humans
Melanoma *
Mutation
Proteins
Reproducibility of Results
Machine Learning
Computational Biology
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Names of Substances
Proteins

Article

PrankWeb 3: accelerated ligand-binding site predictions for experimental and modelled protein structures

Nucleic acids research. 2022 Jul 05 ; 50 (W1) : W593-W597.

Nucleic Acids Res
ISSN 1362-4962 | 0305-1048

Knowledge of protein-ligand binding sites (LBSs) enables research ranging from protein function annotation to structure-based drug design. To this end, we have previously developed a stand-alone tool, P2Rank, and the web server PrankWeb (https://prankweb.cz/) for fast and accurate LBS prediction. Here, we present significant enhancements to PrankWeb. First, a new, more accurate evolutionary conservation estimation pipeline based on the UniRef50 sequence database and the HMMER3 package is introduced. Second, PrankWeb now allows users to enter a UniProt ID to carry out LBS predictions in situations where no experimental structure is available by utilizing the AlphaFold model database. Additionally, a range of minor improvements has been implemented. These include the ability to deploy PrankWeb and P2Rank as Docker containers, support for the mmCIF file format, improved public REST API access, and the ability to batch download the LBS predictions for the whole PDB archive and parts of the AlphaFold database.

Article

PrankWeb: a web server for ligand binding site prediction and visualization

Nucleic acids research. 2019 Jul 02 ; 47 (W1) : W345-W349.

Nucleic Acids Res
ISSN 1362-4962 | 0305-1048

PrankWeb is an online resource providing an interface to P2Rank, a state-of-the-art method for ligand binding site prediction. P2Rank is a template-free machine learning method based on the prediction of local chemical neighborhood ligandability centered on points placed on a solvent-accessible protein surface. Points with a high ligandability score are then clustered to form the resulting ligand binding sites. In addition, PrankWeb provides a web interface enabling users to easily carry out the prediction and visually inspect the predicted binding sites via an integrated sequence-structure view. Moreover, PrankWeb can determine sequence conservation for the input molecule and use this in both the prediction and result visualization steps. Alongside its online visualization options, PrankWeb also offers the possibility of exporting the results as a PyMOL script for offline visualization. The web frontend communicates with the server side via a REST API. In high-throughput scenarios, therefore, users can utilize the server API directly, bypassing the need for a web-based frontend or installation of the P2Rank application. PrankWeb is available at http://prankweb.cz/, while the web application source code and the P2Rank method can be accessed at https://github.com/jendelel/PrankWebApp and https://github.com/rdk/p2rank, respectively.
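
The abstract above describes scoring surface points for ligandability and then clustering the high-scoring points into binding sites. The sketch below illustrates that general idea with a plain single-linkage clustering over 3D points. It is not P2Rank's implementation: the function name, the score threshold, and the linkage distance are illustrative assumptions, and the scores would come from a trained model rather than being supplied by hand.

```python
import math
from collections import deque


def cluster_high_scoring_points(points, scores, score_threshold=0.5, link_distance=3.0):
    """Group points whose score passes a threshold into single-linkage clusters.

    points: list of (x, y, z) tuples; scores: list of floats in [0, 1].
    Both default parameter values are hypothetical, chosen for illustration.
    Returns a list of clusters, each a sorted list of point indices.
    """
    kept = [i for i, s in enumerate(scores) if s >= score_threshold]
    unvisited = set(kept)
    clusters = []
    for seed in kept:
        if seed not in unvisited:
            continue  # already absorbed into an earlier cluster
        unvisited.remove(seed)
        queue = deque([seed])
        cluster = []
        while queue:
            i = queue.popleft()
            cluster.append(i)
            # Collect neighbours first, then update the set, so the set is
            # never modified while it is being iterated.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= link_distance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
        clusters.append(sorted(cluster))
    return clusters


# Toy input: two groups of nearby high-scoring points and one low-scoring point.
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (20, 0, 0), (21, 0, 0), (10, 0, 0)]
scores = [0.9, 0.8, 0.7, 0.6, 0.9, 0.1]
print(cluster_high_scoring_points(points, scores))
```

Single linkage lets a cluster grow along a chain of neighbouring points, so points 0 and 2 in the toy input end up together even though they are farther apart than the linkage distance.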

MeSH
Benchmarking
Datasets as Topic
Protein Interaction Domains and Motifs
Internet
Protein Conformation, alpha-Helical
Protein Conformation, beta-Strand
Humans
Ligands
Proteins chemistry metabolism
Amino Acid Sequence
Software *
Machine Learning *
Thermodynamics
Protein Binding
Binding Sites
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Names of Substances
Ligands
Proteins
