AHoJ-DB: A PDB-wide Assignment of apo & holo Relationships Based on Individual Protein-Ligand Interactions
Language: English
Country: Netherlands
Media: print-electronic
Document type: Journal Article
PubMed: 38508305
DOI: 10.1016/j.jmb.2024.168545
PII: S0022-2836(24)00140-2
- Keywords
  - Apo-holo, binding sites, drug design, ligands, protein structure
- MeSH
  - Apoproteins: chemistry, metabolism
  - Databases, Protein *
  - Protein Conformation *
  - Humans
  - Ligands
  - Models, Molecular
  - Proteins *: chemistry, metabolism
  - Protein Binding *
  - Binding Sites
  - Computational Biology: methods
- Check Tag
  - Humans
- Publication type
  - Journal Article
- Names of Substances
  - Apoproteins
  - Ligands
  - Proteins *
A single protein structure is rarely sufficient to capture the conformational variability of a protein. Both bound and unbound (holo and apo) forms of a protein are essential for understanding its geometry and making meaningful comparisons. Nevertheless, docking and drug design studies often still consider only single protein structures in their holo form, which are for the most part rigid. With the recent explosion in the field of structural biology, large, curated datasets are urgently needed. Here, we use a previously developed application (AHoJ) to perform a comprehensive search for apo-holo pairs for 468,293 biologically relevant protein-ligand interactions across 27,983 proteins. In each search, the binding pocket is captured and mapped across existing structures within the same UniProt entry, and the mapped pockets are annotated as holo or apo, based on the presence or absence of ligands. We assemble the results into a database, AHoJ-DB (www.apoholo.cz/db), that captures the variability of proteins with identical sequences, thereby exposing the agents responsible for the observed differences in geometry. We report several metrics for each annotated pocket, and we also include binding pockets that form at the interface of multiple chains. Analysis of the database shows that about 24% of the binding sites occur at the interface of two or more chains and that fewer than 50% of the binding sites processed have an apo form in the PDB. These results can be used to train and evaluate predictors, discover potentially druggable proteins, and reveal protein- and ligand-specific relationships that were previously obscured by intermittent or partial data. Availability: www.apoholo.cz/db.
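The annotation step described in the abstract can be illustrated with a minimal Python sketch. It is not the AHoJ implementation: the MappedPocket record, the function names, and all identifiers in the example data (UniProt accessions, PDB codes, ligand names) are hypothetical and chosen only for illustration. The sketch assumes the pocket has already been mapped onto each candidate structure and shows only the final labeling: a mapped pocket with at least one ligand is holo, one with none is apo, and a pocket lined by two or more chains counts as an interface site.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedPocket:
    """Hypothetical record for one pocket mapped onto one structure."""
    uniprot_id: str   # UniProt entry shared by the query and this structure
    pdb_id: str       # structure the pocket was mapped onto
    chains: tuple     # chains contributing residues to the pocket
    ligands: tuple    # ligands found inside the mapped pocket (may be empty)


def annotate(pocket):
    """Label a mapped pocket as holo (ligand present) or apo (no ligand)."""
    return "holo" if pocket.ligands else "apo"


def is_interface(pocket):
    """True if the pocket is formed by two or more distinct chains."""
    return len(set(pocket.chains)) >= 2


def group_by_uniprot(pockets):
    """Collect apo and holo structures per UniProt entry."""
    groups = defaultdict(lambda: {"apo": [], "holo": []})
    for p in pockets:
        groups[p.uniprot_id][annotate(p)].append(p.pdb_id)
    return dict(groups)


def fraction_with_apo(summary):
    """Share of UniProt entries that have at least one apo structure."""
    if not summary:
        return 0.0
    return sum(1 for v in summary.values() if v["apo"]) / len(summary)


# Made-up example data; none of these identifiers refer to real entries.
example_pockets = [
    MappedPocket("P00001", "1aaa", ("A",), ("ATP",)),
    MappedPocket("P00001", "2bbb", ("A",), ()),
    MappedPocket("P00001", "3ccc", ("A", "B"), ("HEM",)),
    MappedPocket("P00002", "4ddd", ("A",), ("NAG",)),
]
summary = group_by_uniprot(example_pockets)
```

In this toy data, the first entry has both apo and holo structures, while the second has only a holo form, which mirrors the reported finding that many binding sites lack an apo counterpart in the PDB.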