The acid dissociation constant (pKa) is an important molecular property, and it can be successfully predicted by Quantitative Structure-Property Relationship (QSPR) models, even for in silico designed molecules. We analyzed how the methodology of in silico 3D structure preparation influences the quality of QSPR models. Specifically, we evaluated and compared QSPR models based on six different 3D structure sources (DTP NCI, PubChem, Balloon, Frog2, OpenBabel, and RDKit) combined with four different types of optimization. These analyses were performed for three classes of molecules (phenols, carboxylic acids, anilines), and the QSPR model descriptors were quantum mechanical (QM) and empirical partial atomic charges. In total, we developed 516 QSPR models and then systematically analyzed the influence of the 3D structure source and other factors on their quality. Our results confirmed that QSPR models based on partial atomic charges are able to predict pKa with high accuracy. We also confirmed that ab initio and semiempirical QM charges provide very accurate QSPR models, and that empirical charges based on electronegativity equalization are also acceptable and even advantageous, because their calculation is very fast. On the other hand, Gasteiger-Marsili empirical charges are not applicable for pKa prediction. We further found that QSPR models for some classes of molecules (carboxylic acids) are less accurate. In this context, we compared the influence of different 3D structure sources. We found that an appropriate selection of 3D structure source and optimization method is essential for the successful QSPR modeling of pKa. Specifically, the 3D structures from the DTP NCI and PubChem databases performed the best, as they provided very accurate QSPR models for all the tested molecular classes and charge calculation approaches, and they do not require optimization. Frog2 also performed very well. Other 3D structure sources can also be used but are less robust, and an unfortunate combination of molecular class and charge calculation approach can produce weak QSPR models. Additionally, these 3D structures generally need optimization in order to produce good-quality QSPR models.
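As an illustration of the kind of model discussed above, the following minimal Python sketch fits a linear QSPR of the form pKa = p0 + pH*qH + pO*qO by ordinary least squares. The descriptor choice (charges on the acidic hydrogen and the adjacent oxygen), the function names, and all numeric values are assumptions made for illustration; the charges and pKa values are synthetic and are not data from the study.

```python
# Toy QSPR sketch: pKa = p0 + pH*qH + pO*qO fitted by ordinary least squares.
# All numbers are synthetic placeholders, NOT values from the study.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_qspr(descriptors, pka_values):
    """Return [p0, p1, ...] minimizing the squared pKa prediction error."""
    rows = [[1.0] + list(d) for d in descriptors]  # leading 1.0 = intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, pka_values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_pka(coeffs, descriptor):
    """Apply the fitted linear model to one molecule's charge descriptors."""
    return coeffs[0] + sum(c * q for c, q in zip(coeffs[1:], descriptor))


# Arbitrary coefficients used only to generate self-consistent toy data.
TOY_COEFFS = (20.0, -30.0, 5.0)
# Hypothetical (qH, qO) partial charges for five imaginary molecules.
charges = [(0.40, -0.60), (0.42, -0.58), (0.45, -0.62),
           (0.47, -0.55), (0.50, -0.57)]
pka = [TOY_COEFFS[0] + TOY_COEFFS[1] * qh + TOY_COEFFS[2] * qo
       for qh, qo in charges]

coeffs = fit_qspr(charges, pka)
```

Because the toy pKa values are generated from an exact linear relationship, the fit recovers the generating coefficients; with real charges and experimental pKa values the residuals would be non-zero and the model quality would have to be validated separately.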
- MeSH
- Chemical Phenomena * MeSH
- Quantitative Structure-Activity Relationship * MeSH
- Quantum Theory MeSH
- Molecular Conformation * MeSH
- Models, Molecular * MeSH
- Computer Simulation MeSH
- Drug Design MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
Proteins are naturally composed of domains that define their functional and structural properties. A domain taken out of the context of an entire protein can retain its structure and, to some extent, also its function. These properties justify the construction of artificial multidomain fusion proteins with unique combinations of various functions. Information on the specific functional and structural characteristics of individual domains in the context of new artificial fusion proteins is inevitably encoded in the sequential order of the constituent domains, which defines their mutual spatial positions. The challenges in designing new proteins with new domain combinations therefore lie mainly in structure/function prediction and its context dependency. Despite the enormous body of publications on artificial fusion proteins, the task of predicting their structure and function is complex and nontrivial. The degree of spatial freedom provided by a linker between domains, together with their mutual orientation driven by noncovalent interactions, is beyond the reach of any simple and straightforward methodology for predicting their structure with reasonable accuracy. In the presented manuscript, we tested a methodology using available modeling tools and computational methods. We show that the process and methodology of such prediction are not straightforward and must be applied with care even when the recently introduced AlphaFold II is used. We also addressed the question of benchmarking standards for the prediction of multidomain protein structures: x-ray or nuclear magnetic resonance (NMR) experiments. From a study of six two-domain protein chimeras, as well as their constituent domains and their x-ray structures selected from the PDB, we conclude that the major obstacle to justified prediction is inappropriate sampling of the conformational space by the explored methods. On the other hand, we can still address particular steps of the methodology and improve the process of chimeric protein structure prediction.
- Keywords
- 3D structure prediction, fusion proteins, molecular simulations, x-ray crystallography
- MeSH
- Protein Domains MeSH
- Proteins * chemistry MeSH
- Recombinant Fusion Proteins * chemistry MeSH
- X-Rays MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Proteins * MeSH
- Recombinant Fusion Proteins * MeSH
We describe a novel approach to reaction representation as a combination of two mixtures: a mixture of reactants and a mixture of products. In turn, each mixture can be encoded using a previously reported approach involving simplex descriptors (SiRMS). The feature vector representing these two mixtures results from either concatenating product and reactant descriptors or taking the difference between the descriptors of products and reactants. This reaction representation does not need explicit labeling of a reaction center. A rigorous "product-out" cross-validation (CV) strategy is suggested. Unlike the naïve "reaction-out" CV approach based on a random selection of items, the proposed strategy provides a more realistic estimate of prediction accuracy for reactions resulting in novel products. The new methodology has been applied to model rate constants of E2 reactions. It has been demonstrated that the use of the fragment control applicability domain approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
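The two ideas in this abstract, building a reaction feature vector from mixture descriptors and splitting data "product-out", can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the mixture descriptors are taken as precomputed numeric vectors (SiRMS itself is not implemented), and the function names, the `"product"` record key, and the round-robin fold assignment are hypothetical choices, not the authors' implementation.

```python
def reaction_features(reactant_desc, product_desc, mode="difference"):
    """Combine reactant-mixture and product-mixture descriptor vectors.

    mode="concatenate": product descriptors followed by reactant descriptors.
    mode="difference":  element-wise (product - reactant).
    """
    if len(reactant_desc) != len(product_desc):
        raise ValueError("descriptor vectors must have the same length")
    if mode == "concatenate":
        return list(product_desc) + list(reactant_desc)
    if mode == "difference":
        return [p - r for r, p in zip(reactant_desc, product_desc)]
    raise ValueError(f"unknown mode: {mode}")


def product_out_folds(reactions, n_folds):
    """Assign reactions to CV folds so that no product spans two folds.

    Every reaction leading to a given product lands in the same fold, so a
    held-out fold only contains products unseen during training.
    """
    products = sorted({rxn["product"] for rxn in reactions})
    # Round-robin assignment of distinct products to folds (an assumption).
    fold_of = {prod: i % n_folds for i, prod in enumerate(products)}
    folds = [[] for _ in range(n_folds)]
    for index, rxn in enumerate(reactions):
        folds[fold_of[rxn["product"]]].append(index)
    return folds


# Hypothetical reaction records; only the product identifier matters here.
toy_reactions = [
    {"product": "A"}, {"product": "B"}, {"product": "A"}, {"product": "C"},
]
toy_folds = product_out_folds(toy_reactions, n_folds=2)
```

By contrast, a "reaction-out" split would shuffle the reaction indices directly, so two reactions giving the same product could end up on both sides of the train/test boundary.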
- Keywords
- Chemical reactions, Condensed graph of reaction, Mixtures, Rate constant prediction, Reaction fingerprints, Simplex representation of molecular structure
- MeSH
- Kinetics MeSH
- Quantitative Structure-Activity Relationship MeSH
- Models, Molecular * MeSH
- Molecular Structure MeSH
- Organic Chemicals chemistry MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Organic Chemicals MeSH
Cardiac arrhythmias are very common. Pharmacotherapy is not very effective in persistent arrhythmias and carries a number of risks. Catheter ablation has become an effective and curative treatment method over the past 20 years. To support complex arrhythmia ablations, 3D X-ray imaging of the cardiac cavities is used, most frequently 3D reconstruction of CT images. 3D cardiac rotational angiography (3DRA) is a modern method that enables the creation of CT-like 3D images on a standard X-ray machine equipped with special software. Its advantages lie in the possibility of obtaining images during the procedure, a decreased radiation dose, and a reduced amount of contrast agent. The left atrium model is the one most frequently used for complex atrial arrhythmia ablations, particularly for atrial fibrillation. CT data allow for the creation and segmentation of 3D models of all cardiac cavities. Recent research has demonstrated the use of 3DRA to create 3D models of other cardiac structures (right ventricle, left ventricle, aorta) and non-cardiac structures (oesophagus). These models can be used during catheter ablation of complex arrhythmias to improve orientation during the construction of 3D electroanatomic maps, fused directly with 3D electroanatomic systems, and/or fused with fluoroscopy. The creation and use of 3D models have developed intensively over the past years, and the models have become routinely used during catheter ablations of arrhythmias, mainly atrial fibrillation ablation procedures. Further development may be anticipated in both the creation and use of these models.
- MeSH
- Action Potentials MeSH
- Surgery, Computer-Assisted MeSH
- Radiography, Interventional MeSH
- Catheter Ablation methods MeSH
- Coronary Angiography methods MeSH
- Humans MeSH
- Multidetector Computed Tomography methods MeSH
- Predictive Value of Tests MeSH
- Heart Conduction System diagnostic imaging physiopathology surgery MeSH
- Radiographic Image Interpretation, Computer-Assisted * MeSH
- Software MeSH
- Arrhythmias, Cardiac diagnostic imaging physiopathology surgery MeSH
- Imaging, Three-Dimensional * MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
To predict unknown reactivation potencies of 12 mono- and bis-pyridinium aldoximes for VX-inhibited rat acetylcholinesterase (rAChE), a three-dimensional quantitative structure-activity relationship (3D QSAR) analysis has been carried out. Utilizing molecular interaction fields (MIFs) calculated by molecular mechanical (MMFF94) and quantum chemical (B3LYP/6-31G*) methods, two satisfactory ligand-based CoMFA models have been developed: (1) R² = 0.9989, Q²(LOO) = 0.9090, Q²(LTO) = 0.8921, Q²(LMO 20%) = 0.8853, R²(ext) = 0.9259, SDEP(ext) = 6.8938; (2) R² = 0.9962, Q²(LOO) = 0.9368, Q²(LTO) = 0.9298, Q²(LMO 20%) = 0.9248, R²(ext) = 0.8905, SDEP(ext) = 6.6756. High statistical significance of the 3D QSAR models has been achieved through the application of several data noise reduction techniques (i.e., smart region definition (SRD), fractional factorial design (FFD), and uninformative/iterative variable elimination (UVE/IVE)) to the original MIFs. Besides the ligand-based CoMFA models, a molecular alignment set constructed by flexible molecular docking has also been studied. The contour maps, as well as the predicted reactivation potencies resulting from the 3D QSAR analyses, help to better understand which structural features are associated with increased reactivation potency of the studied compounds.
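The leave-one-out statistic Q²(LOO) reported above can be illustrated with a short Python sketch. It assumes the common definition Q² = 1 - PRESS/SS_tot, where PRESS sums the squared errors of predictions made for each compound by a model trained without it. A one-descriptor linear regression stands in for the actual CoMFA model here, and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the model fitted on all data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out cross-validated Q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities (not data from the study).
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.0, 2.2, 2.9, 4.1, 5.0]
toy_r2 = r_squared(toy_x, toy_y)
toy_q2 = q2_loo(toy_x, toy_y)
```

For ordinary least squares the leave-one-out residuals are never smaller in magnitude than the fitted residuals, so Q²(LOO) cannot exceed R²; a large gap between the two is a typical sign of overfitting.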
- Keywords
- 3D QSAR, Acetylcholinesterase, Computational chemistry, Molecular docking, Reactivators, VX
- MeSH
- Acetylcholinesterase chemistry MeSH
- Enzyme Activation MeSH
- Chemical Warfare Agents chemistry MeSH
- Cholinesterase Inhibitors chemistry MeSH
- GPI-Linked Proteins agonists antagonists & inhibitors chemistry MeSH
- Kinetics MeSH
- Rats MeSH
- Quantitative Structure-Activity Relationship MeSH
- Quantum Theory MeSH
- Ligands MeSH
- Organothiophosphorus Compounds chemistry MeSH
- Oximes chemistry MeSH
- Pyridinium Compounds chemistry MeSH
- Cholinesterase Reactivators chemistry MeSH
- Molecular Dynamics Simulation MeSH
- Molecular Docking Simulation MeSH
- Thermodynamics MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Acetylcholinesterase MeSH
- Ache protein, rat MeSH Browser
- Chemical Warfare Agents MeSH
- Cholinesterase Inhibitors MeSH
- GPI-Linked Proteins MeSH
- Ligands MeSH
- Organothiophosphorus Compounds MeSH
- Oximes MeSH
- Pyridinium Compounds MeSH
- Cholinesterase Reactivators MeSH
- VX MeSH Browser
The problem of designing a tablet geometry and internal structure that result in a specified release profile of the drug during dissolution was considered. A solution method based on parametric programming, inspired by CAD (computer-aided design) approaches currently used in other fields of engineering, was proposed and demonstrated. The forward problem was first solved for a parametric series of structural motifs in order to generate a library of drug release profiles, one for each structural motif. The inverse problem was then solved in three steps. First, the combination of basic structural motifs whose superposition provides the closest approximation of the required drug release profile was found as a linear combination of pre-calculated release profiles. In the next step, the final tablet design was constructed and its dissolution curve was computed. Finally, the proposed design was 3D printed and its dissolution profile was confirmed experimentally. The computational method was based on the numerical solution of drug diffusion in a boundary layer surrounding the tablet, coupled with erosion of the tablet structure encoded by the phase volume function. The tablets were 3D printed by fused deposition modelling (FDM) from filaments produced by hot-melt extrusion. It was found that the drug release profile could be effectively controlled by modifying the tablet porosity. Custom release profiles were obtained by combining multiple porosity regions in the same tablet. The computational method yielded accurate predictions of the drug release rate for both single- and multi-porosity tablets.
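The first step of the inverse problem, approximating a target release profile by a linear combination of pre-calculated profiles, can be sketched as a least-squares fit. The Python example below is a simplified illustration restricted to two motifs, solved in closed form from the normal equations; the function name and the profile values are hypothetical, and the authors' parametric programming method is not reproduced here.

```python
def fit_two_motif_weights(profile_a, profile_b, target):
    """Least-squares weights (wa, wb) so that wa*A + wb*B approximates target.

    Each profile is a list of released drug fractions sampled at the same
    time points. The 2x2 normal equations are solved with Cramer's rule.
    """
    saa = sum(a * a for a in profile_a)
    sbb = sum(b * b for b in profile_b)
    sab = sum(a * b for a, b in zip(profile_a, profile_b))
    sat = sum(a * t for a, t in zip(profile_a, target))
    sbt = sum(b * t for b, t in zip(profile_b, target))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("profiles are (nearly) proportional; weights not unique")
    wa = (sat * sbb - sab * sbt) / det
    wb = (saa * sbt - sab * sat) / det
    return wa, wb


# Hypothetical library entries: a fast-releasing and a slow-releasing motif.
fast_motif = [0.0, 0.5, 0.8, 1.0]
slow_motif = [0.0, 0.1, 0.3, 0.6]
# Toy target built as a known blend so the result can be checked.
target_profile = [0.25 * f + 0.75 * s for f, s in zip(fast_motif, slow_motif)]

weight_fast, weight_slow = fit_two_motif_weights(
    fast_motif, slow_motif, target_profile
)
```

A practical design tool would likely handle more than two motifs and constrain the weights to physically meaningful values (for example non-negative volume fractions); both are left out of this sketch.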
- Keywords
- 3D printing, dissolution, hot-melt extrusion, mathematical modelling, parametric programming
- MeSH
- Printing, Three-Dimensional * MeSH
- Technology, Pharmaceutical methods MeSH
- Porosity MeSH
- Tablets chemistry pharmacokinetics MeSH
- Drug Liberation MeSH
- Publication type
- Journal Article MeSH
- Names of Substances
- Tablets MeSH
Proteins are biomolecules with characteristic three-dimensional (3D) arrangements that enable their diverse vital functions. In the last 20 years, there has been a growing interest in biopharmaceutical proteins, especially antibodies, due to their therapeutic application. The functionality of a protein depends on the preservation of its native form, which under certain stress conditions can undergo changes at different structural levels that cause it to lose its activity. Although mass spectrometry is a powerful technique for primary structure determination, it often fails to give information at higher-order levels. Like infrared (IR) spectra, Raman spectra are well known to contain bands (especially the amide I band at 1625-1725 cm-1) that correlate with secondary structure (SS) content. However, unlike circular dichroism (CD), the most well-established technique for SS analysis, Raman spectroscopy allows a much wider range of optical density, making possible the analysis of highly concentrated samples with no prior dilution. Moreover, water is a weak scatterer below 3000 cm-1, which gives Raman an advantage over IR for the analysis of complex aqueous pharmaceutical samples, where the water signal dominates the amide I region of IR spectra. The most traditional procedure for extracting information on SS content is band-fitting. However, in most cases, we found the method to be ambiguous, limited by spectral noise, and subject to the judgment of the analyst. A self-organizing map (SOM) is a type of self-learning algorithm that organizes data in a two-dimensional (2D) space based on spectral similarity and class, with no bias from the analyst and very little effect from noise. In this work, a set of protein spectra with known SS content was collected in both the solid and aqueous state with back-scatter Raman spectroscopy and used to train a SOM algorithm for SS prediction. The results were compared with those obtained by partial least squares (PLS) regression, band-fitting, and X-ray data in the literature. The prediction errors observed by SOM were comparable to those by PLS and far from those obtained by band-fitting, proving Raman-SOM to be a viable alternative to the aforementioned methods.
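The core update rule of a self-organizing map can be illustrated with a very small Python sketch. It is a generic textbook-style SOM step, not the authors' implementation: the 2x2 grid, the Gaussian neighbourhood function, the linear learning-rate decay, the initial weights, and the two-channel "spectra" are all assumptions chosen to keep the example short.

```python
import math


def make_grid():
    """Hypothetical 2x2 map; each node holds a 2-channel weight vector."""
    return {(r, c): [0.2 + 0.2 * (2 * r + c)] * 2
            for r in range(2) for c in range(2)}


def best_matching_unit(weights, sample):
    """Grid position whose weight vector is closest to the sample."""
    return min(weights, key=lambda pos: math.dist(weights[pos], sample))


def som_update(weights, sample, learning_rate, sigma):
    """Pull the best-matching unit and its grid neighbours toward the sample."""
    bmu = best_matching_unit(weights, sample)
    for pos, w in weights.items():
        grid_dist_sq = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
        # Gaussian neighbourhood: 1.0 at the BMU, smaller for distant nodes.
        influence = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))
        weights[pos] = [wi + learning_rate * influence * (xi - wi)
                        for wi, xi in zip(w, sample)]
    return bmu


def train_som(weights, samples, epochs=20, learning_rate=0.5, sigma=0.5):
    """Repeatedly present all samples with a linearly decaying learning rate."""
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        for sample in samples:
            som_update(weights, sample, learning_rate * decay, sigma)
    return weights


# Hypothetical two-channel "spectra" (real spectra have many more channels).
toy_spectra = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
trained_map = train_som(make_grid(), toy_spectra)
```

To use such a map for secondary structure prediction, each node would additionally be labelled with the known SS content of the training spectra mapped to it, and a new spectrum would receive the label of its best-matching unit; that labelling step is omitted here.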
- Keywords
- Raman spectroscopy, Secondary structure of proteins, circular dichroism spectroscopy, infrared spectroscopy, self-organizing maps
- Publication type
- Journal Article MeSH
A quantitative structure-activity relationship (QSAR) model dependent on log P(n-octanol/water), or log P(OW), was developed for the acute toxicity index EC50, the median effective concentration measured as inhibition of movement of the oligochaete Tubifex tubifex after 3 min exposure, EC50(Tt) (mol/L): log EC50(Tt) = -0.809 (±0.035) log P(OW) - 0.495 (±0.060), n = 82, r = 0.931, r2 = 0.867, residual standard deviation of the estimate 0.315. The learning series for the QSAR model with the oligochaete contained alkanols, alkenols, and alkynols; saturated and unsaturated aldehydes; aniline and chlorinated anilines; phenol and chlorinated phenols; and esters. Three cross-validation procedures proved the robustness and stability of the QSAR model with respect to the chemical structure of the compounds in the learning series. Predictive ability was described by q2 = 0.801 (cross-validated r2; predicted variation estimated with cross-validation) in LSO (leave-a-structural-series-out) cross-validation.
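The regression equation above can be applied directly. The short Python sketch below evaluates it for a given log P(OW); the coefficients are those reported in the abstract, while the function names are hypothetical and the logarithms are assumed to be base 10, as is conventional for log P and log EC50.

```python
# Coefficients as reported in the abstract (standard errors omitted).
SLOPE = -0.809
INTERCEPT = -0.495


def predict_log_ec50(log_p_ow):
    """Predicted log EC50(Tt), with EC50 in mol/L, from log P(OW)."""
    return SLOPE * log_p_ow + INTERCEPT


def predict_ec50_mol_per_l(log_p_ow):
    """Predicted EC50(Tt) in mol/L, assuming base-10 logarithms."""
    return 10.0 ** predict_log_ec50(log_p_ow)


# Example: a hypothetical compound with log P(OW) = 2.0.
example_log_ec50 = predict_log_ec50(2.0)
example_ec50 = predict_ec50_mol_per_l(2.0)
```

The negative slope means that more lipophilic compounds are predicted to be more toxic (lower EC50). Predictions are only meaningful for compounds resembling the learning series listed in the abstract.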
- MeSH
- Quantitative Structure-Activity Relationship * MeSH
- Oligochaeta drug effects MeSH
- Toxicity Tests, Acute methods MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Beta-site APP cleaving enzyme 1 (BACE1) catalyzes the rate-determining step in the generation of Aβ peptide and is widely considered a potential therapeutic drug target for Alzheimer's disease (AD). The active site of BACE1 contains a catalytic aspartic (Asp) dyad and a flap. The Asp dyad cleaves the substrate, amyloid precursor protein, with the help of the flap. Currently, there are no marketed drugs available against BACE1, and existing inhibitors are mostly pseudopeptide or synthetic derivatives. There is a need to search for a potent inhibitor with a natural scaffold that interacts with the flap and the Asp dyad. This study screens the natural compound database InterBioScreen, followed by three-dimensional (3D) QSAR pharmacophore modeling, mapping, and in silico ADME/T predictions, to find potential BACE1 inhibitors. Further, molecular dynamics simulations of selected inhibitors were performed to observe the dynamic structure of the protein after ligand binding. All conformations and the residues of the binding region were stable, but the flap adopted a closed conformation after binding the ligand. The bound oligosaccharide interacted with the flap as well as the catalytic dyad via hydrogen bonds throughout the simulation. This stabilized the flap in the closed conformation and restricted the entry of substrate. Carbohydrates have previously been used in the treatment of AD because of their low toxicity, high efficiency, good biocompatibility, and easy permeability through the blood-brain barrier. Our findings will be helpful in identifying potential leads for the design of novel BACE1 inhibitors for AD therapy.
- Keywords
- 3D QSAR pharmacophore modeling, Alzheimer’s disease, Asp dyad, flap, molecular dynamics, oligosaccharide, virtual screening, β-secretase
- MeSH
- Algorithms MeSH
- Aspartic Acid Endopeptidases antagonists & inhibitors metabolism MeSH
- Biological Products chemistry pharmacology MeSH
- Enoxaparin pharmacology MeSH
- Heparitin Sulfate pharmacology MeSH
- Inhibitory Concentration 50 MeSH
- Enzyme Inhibitors chemistry pharmacology MeSH
- Crystallography, X-Ray MeSH
- Quantitative Structure-Activity Relationship * MeSH
- Humans MeSH
- Ligands MeSH
- Oligosaccharides chemistry MeSH
- Drug Evaluation, Preclinical * MeSH
- Amyloid Precursor Protein Secretases antagonists & inhibitors metabolism MeSH
- Molecular Dynamics Simulation * MeSH
- Hydrogen Bonding MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Names of Substances
- Aspartic Acid Endopeptidases MeSH
- BACE1 protein, human MeSH Browser
- Biological Products MeSH
- Enoxaparin MeSH
- Heparitin Sulfate MeSH
- Enzyme Inhibitors MeSH
- Ligands MeSH
- Oligosaccharides MeSH
- Amyloid Precursor Protein Secretases MeSH
MOTIVATION: Predicting protein-ligand binding sites is crucial in studying protein interactions with applications in biotechnology and drug discovery. Two distinct paradigms have emerged for this purpose: sequence-based methods, which leverage protein sequence information, and structure-based methods, which rely on the three-dimensional (3D) structure of the protein. Here, we analyze a hybrid approach that combines the strengths of both paradigms by integrating two recent deep learning architectures: protein language models (pLMs) from the sequence-based paradigm and Graph Neural Networks (GNNs) from the structure-based paradigm. Specifically, we construct a residue-level Graph Attention Network (GAT) model based on the protein's 3D structure that uses pre-trained pLM embeddings as node features. This integration enables us to study the interplay between the sequential information encoded in the protein sequence and the spatial relationships within the protein structure on the model performance. RESULTS: By exploiting a benchmark dataset over a range of ligands and ligand types, we have shown that using the structure information consistently enhances the predictive power of the baselines in absolute terms. Nevertheless, as more complex pLMs are used to represent node features, the relative impact of the structure information represented by the GNN architecture diminishes. The above observations suggest that although the use of the experimental protein structure almost always improves the accuracy of the prediction of the binding site, complex pLMs still contain structural information that leads to good predictive performance even without the use of 3D structure. AVAILABILITY: The datasets generated and/or analyzed during the current study, as well as pretrained models are available in the following Zenodo link https://zenodo.org/records/15184302. 
The source code that was used to generate the results of the current study is available in the following GitHub repository https://github.com/hamzagamouh/pt-lm-gnn as well as in the following Zenodo link https://zenodo.org/records/15192327. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Journal online.
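The input to a residue-level graph model of the kind described above can be sketched as follows: residues become nodes carrying their pLM embedding as features, and edges connect residues that are close in 3D space. This Python sketch covers only graph construction, not the GAT itself. The distance cutoff of 8.0 Å, the use of one representative coordinate per residue, the function name, and the toy values are all assumptions; the abstract does not state how edges were defined.

```python
import math


def build_residue_graph(coords, embeddings, cutoff=8.0):
    """Build a residue contact graph with embeddings as node features.

    coords:     one (x, y, z) point per residue (e.g. a C-alpha atom; assumed).
    embeddings: one feature vector per residue (e.g. from a pre-trained pLM).
    cutoff:     distance threshold in angstroms for drawing an edge (assumed).
    """
    if len(coords) != len(embeddings):
        raise ValueError("need exactly one embedding per residue")
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Store both directions so message passing is symmetric.
                edges.append((i, j))
                edges.append((j, i))
    return {"node_features": [list(e) for e in embeddings], "edges": edges}


# Hypothetical three-residue example with 2-dimensional toy embeddings.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
toy_graph = build_residue_graph(toy_coords, toy_embeddings)
```

A graph attention layer would then update each residue's features from its neighbours along these edges, which is how spatial context is combined with the sequence-derived embeddings.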
- Publication type
- Journal Article MeSH