Most cited article - PubMed ID 29931189
3DPatch: fast 3D structure visualization with residue conservation
MOTIVATION: Structure-based methods for detecting protein-ligand binding sites play a crucial role in various domains, from fundamental research to biomedical applications. However, current prediction methodologies often rely on holo (ligand-bound) protein conformations for training and evaluation, overlooking the significance of the apo (ligand-free) states. This oversight is particularly problematic in the case of cryptic binding sites (CBSs), where holo-based assessment yields unrealistic performance expectations.
RESULTS: To advance development in this domain, we introduce CryptoBench, a benchmark dataset tailored for training and evaluating novel CBS prediction methodologies. CryptoBench is constructed upon a large collection of apo-holo protein pairs, grouped by UniProt ID, clustered by sequence identity, and filtered to contain only structures with substantial structural change in the binding site. CryptoBench comprises 1107 structures with predefined cross-validation splits, making it the most extensive CBS dataset to date. To establish a performance baseline, we measured the predictive power of sequence- and structure-based CBS residue prediction methods using the benchmark. We selected PocketMiner as the state-of-the-art representative of structure-based methods for CBS detection, and P2Rank as a widely used structure-based method for general binding site prediction that is not specifically tailored to cryptic sites. For sequence-based approaches, we trained a neural network to classify binding residues using protein language model embeddings. Our sequence-based approach outperformed PocketMiner and P2Rank across key metrics, including area under the curve, area under the precision-recall curve, Matthews correlation coefficient, and F1 score. These results provide baseline benchmark results for future CBS and potentially also non-CBS prediction endeavors, leveraging CryptoBench as the foundational platform for further advancements in the field.
AVAILABILITY AND IMPLEMENTATION: The CryptoBench dataset, including the benchmark model, is available on Open Science Framework (https://osf.io/pz4a9/). The code and tutorial are available in the GitHub repository (https://github.com/skrhakv/CryptoBench/).
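As an illustration of two of the threshold-based metrics named in the abstract, the following minimal sketch computes the F1 score and the Matthews correlation coefficient from per-residue binary labels. It is not taken from the CryptoBench code base: the function names and the toy label vectors are hypothetical, and the choice to return 0.0 when a denominator is zero is an assumption made here for convenience.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = binding residue."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 if undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: one label per residue of a short chain.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn} F1={f1:.3f} MCC={mcc:.3f}")
```

Because binding residues are usually a small minority of a chain, MCC and F1 are more informative than plain accuracy here: both penalize a predictor that labels every residue as non-binding.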
- MeSH
- Benchmarking MeSH
- Databases, Protein MeSH
- Protein Conformation MeSH
- Ligands MeSH
- Proteins * chemistry metabolism MeSH
- Software * MeSH
- Protein Binding MeSH
- Binding Sites MeSH
- Computational Biology * methods MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Ligands MeSH
- Proteins * MeSH
Interactions among amino acid residues are the principal contributor to the stability of the three-dimensional structure of a protein. The Amino Acid Interactions (INTAA) web server (https://bioinfo.uochb.cas.cz/INTAA/) has established itself as a unique computational resource, which enables users to calculate the contribution of individual residues in a biomolecular structure to its total energy using a molecular mechanical scoring function. In this update, we describe major additions to the web server which help solidify its position as a robust, comprehensive resource for biomolecular structure analysis. Importantly, a new continuum solvation model was introduced, allowing more accurate representation of electrostatic interactions in aqueous media. In addition, a low-overhead pipeline for the estimation of evolutionary conservation in protein chains has been added. New visualization options were introduced as well, allowing users to easily switch between and interrelate the energetic and evolutionary views of the investigated structures.
- MeSH
- Amino Acids chemistry MeSH
- Internet MeSH
- Protein Conformation * MeSH
- Models, Molecular MeSH
- Proteins chemistry MeSH
- Software * MeSH
- Static Electricity MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Amino Acids MeSH
- Proteins MeSH