Descriptor based models
Many biochemical reactions are based on redox reactions. Therefore, the redox potential of a chemical compound may be related to its therapeutic or physiological effects. The study of the redox properties of compounds is the domain of electrochemistry. The subject of this review is the relationship between electrochemistry and medicinal chemistry, with a focus on quantifying these relationships. A summary of the relevant achievements in correlating redox potential with structure and with therapeutic activity is presented. The first part of the review examines the applicability of QSPR to the prediction of redox properties of medically important compounds. The second part provides an exhaustive review of publications that use redox potential as a molecular descriptor in QSAR of biological activity. Despite the complexity of medicinal chemistry and biological reactions, it is possible to employ redox potential in QSAR/QSPR. In many cases, this electrochemical parameter plays an essential but rarely absolute role.
- MeSH
- Anti-Bacterial Agents chemistry MeSH
- Antioxidants chemistry MeSH
- Antiprotozoal Agents chemistry MeSH
- Antiviral Agents chemistry MeSH
- Chemistry, Pharmaceutical * MeSH
- Quantitative Structure-Activity Relationship * MeSH
- Oxidation-Reduction MeSH
- Antineoplastic Agents chemistry MeSH
- Electron Transport MeSH
- Publication Type
- Journal Article MeSH
- Review MeSH
The large amounts of ammonia emitted by industrial production have caused serious environmental pollution problems, such as soil acidification, eutrophication, the formation of fine particles and changes in the global greenhouse balance, and also greatly endanger human health. At present, effectively reducing ammonia emissions or recovering ammonia is still a huge challenge. Ionic liquids (ILs), a new class of green solvents, have been introduced for ammonia absorption with great potential, but the huge number of possible IL combinations makes it difficult to measure the ammonia solubility in all ILs experimentally (e.g., because of danger and cost). This study therefore proposes a novel approach for estimating the ammonia solubility in different ILs. A predictive model was developed based on the novel extreme learning machine (ELM) algorithm, with the molecular descriptors of electrostatic potential surface areas (SEP) as input parameters. In total, 502 data points of ammonia solubility in 17 ILs were gathered over a wide range of pressure and temperature. For the total set, the coefficient of determination (R²) and the average absolute relative deviation (AARD) of the developed model were 0.9937 and 2.95%, respectively. The regression plots revealed good consistency between predicted and experimental data points. The results show the good performance and reliability of the developed model, indicating that the proposed approach can potentially be applied for screening suitable ILs to absorb ammonia from chemical industry processes.
- MeSH
- Ammonia MeSH
- Ionic Liquids * MeSH
- Humans MeSH
- Reproducibility of Results MeSH
- Solvents MeSH
- Temperature MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
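The two figures of merit quoted in the ammonia solubility abstract above, the coefficient of determination and the AARD, can be computed as in the following minimal sketch. It assumes the usual definitions (AARD as the mean of |predicted - experimental| / |experimental|, expressed in percent, and the coefficient of determination as 1 - SS_res / SS_tot); the abstract does not state its formulas, and the function names and sample values are purely illustrative.

```python
def aard_percent(experimental, predicted):
    """Average absolute relative deviation, in percent (assumed definition)."""
    if len(experimental) != len(predicted) or not experimental:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - e) / abs(e) for e, p in zip(experimental, predicted))
    return 100.0 * total / len(experimental)


def r_squared(experimental, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot


# Made-up solubility values, for illustration only (not data from the study).
x_exp = [0.10, 0.20, 0.40]
x_pred = [0.11, 0.19, 0.40]
print(round(aard_percent(x_exp, x_pred), 2), round(r_squared(x_exp, x_pred), 4))
```

Because AARD divides by the experimental value, it weights errors on small solubilities more heavily than an absolute error measure would.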
We describe a novel approach to reaction representation as a combination of two mixtures: a mixture of reactants and a mixture of products. In turn, each mixture can be encoded using an earlier reported approach involving simplex descriptors (SiRMS). The feature vector representing these two mixtures results either from concatenating the product and reactant descriptors or from taking the difference between the descriptors of products and reactants. This reaction representation does not need explicit labeling of a reaction center. A rigorous "product-out" cross-validation (CV) strategy is suggested. Unlike the naïve "reaction-out" CV approach based on a random selection of items, the proposed one provides a more realistic estimate of prediction accuracy for reactions resulting in novel products. The new methodology has been applied to model rate constants of E2 reactions. It has been demonstrated that the use of the fragment control applicability domain approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
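The two ways of building a reaction feature vector and the idea behind "product-out" cross-validation can be illustrated with a short sketch. It assumes that descriptor vectors for the reactant and product mixtures have already been computed (the SiRMS calculation itself is not shown), and the names `reaction_vector` and `product_out_folds` are hypothetical, not taken from the authors' software.

```python
from collections import defaultdict


def reaction_vector(reactant_desc, product_desc, mode="concat"):
    """Combine precomputed mixture descriptors into one reaction vector."""
    if mode == "concat":
        return list(product_desc) + list(reactant_desc)
    if mode == "diff":
        if len(reactant_desc) != len(product_desc):
            raise ValueError("difference needs vectors of equal length")
        return [p - r for p, r in zip(product_desc, reactant_desc)]
    raise ValueError("mode must be 'concat' or 'diff'")


def product_out_folds(product_keys, n_folds=3):
    """Split reaction indices so that no product occurs in both train and test.

    product_keys[i] identifies the product(s) of reaction i, e.g. a
    canonical SMILES string (an assumption made for this sketch).
    """
    by_product = defaultdict(list)
    for idx, key in enumerate(product_keys):
        by_product[key].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Deterministic round-robin assignment of whole product groups to folds.
    for i, key in enumerate(sorted(by_product)):
        folds[i % n_folds].extend(by_product[key])
    splits = []
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        splits.append((train, test))
    return splits


# Toy example: six reactions leading to four distinct products.
keys = ["P1", "P1", "P2", "P3", "P2", "P4"]
for train, test in product_out_folds(keys, n_folds=2):
    print("train:", train, "test:", test)
```

A random "reaction-out" split could place the two reactions that give P1 on opposite sides of the split, which is the leakage that grouping by product avoids.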
Co-milling is an effective technique for improving the dissolution-rate-limited absorption characteristics of poorly water-soluble drugs. However, there is a scarcity of models available to forecast the magnitude of dissolution rate improvement caused by co-milling. Therefore, this study endeavoured to quantitatively predict the increase in dissolution by co-milling based on drug properties. Using a biorelevant dissolution setup, a series of 29 structurally diverse and crystalline drugs were screened in co-milled and physically blended mixtures with polyvinylpyrrolidone K25. Co-Milling Dissolution Ratios after 15 min (COMDR15 min) and 60 min (COMDR60 min) drug release were predicted by variable selection in the framework of a partial least squares (PLS) regression. The model forecasts the COMDR15 min (R² = 0.82 and Q² = 0.77) and COMDR60 min (R² = 0.87 and Q² = 0.84) with small differences in root mean square errors of training and test sets by selecting four drug properties. Based on three of these selected variables, applicable multiple linear regression equations were developed with a high predictive power of R² = 0.83 (COMDR15 min) and R² = 0.84 (COMDR60 min). The most influential predictor variable was the median drug particle size before milling, followed by the calculated drug logD6.5 value, the calculated molecular descriptor Kappa 3 and the apparent solubility of drugs after 24 h dissolution. The study demonstrates the feasibility of forecasting the dissolution rate improvements of poorly water-soluble drugs through co-milling. These models can be applied as computational tools to guide formulation in early-stage development.
Modern QSAR approaches have wide practical applications in drug discovery for designing potentially bioactive molecules. If such models are based on 2D descriptors, important information contained in the spatial structures of molecules is lost. The major problem in constructing models using 3D descriptors is the choice of a putative bioactive conformation, which affects the predictive performance. The multi-instance (MI) learning approach, which considers multiple conformations in model training, could be a reasonable solution to this problem. In this study, we implemented several multi-instance algorithms, both conventional and based on deep learning, and investigated their performance. We compared the performance of MI-QSAR models with those based on the classical single-instance QSAR (SI-QSAR) approach, in which each molecule is encoded by either 2D descriptors computed for the corresponding molecular graph or 3D descriptors computed for a single lowest-energy conformation. The calculations were carried out on 175 data sets extracted from the ChEMBL23 database. It is demonstrated that (i) MI-QSAR outperforms SI-QSAR in numerous cases and (ii) MI algorithms can automatically identify plausible bioactive conformations.
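In the multi-instance setting, each molecule is a "bag" of conformers, each with its own 3D descriptor vector, and only the bag carries an activity label. The sketch below shows two simple, conventional ways of reducing a bag to a single prediction. It is offered as a generic illustration under stated assumptions and is not necessarily one of the algorithms implemented in the study; the function names and the toy scoring function are hypothetical.

```python
def mean_pool(bag):
    """Bag-level pooling: average each descriptor over all conformers."""
    if not bag:
        raise ValueError("a bag needs at least one conformer")
    n = len(bag)
    return [sum(column) / n for column in zip(*bag)]


def predict_bag(instance_model, bag):
    """Instance-level pooling: score every conformer, then average the scores."""
    if not bag:
        raise ValueError("a bag needs at least one conformer")
    scores = [instance_model(conformer) for conformer in bag]
    return sum(scores) / len(scores)


# One molecule with two conformers, each described by two made-up descriptors.
bag = [[1.0, 4.0], [3.0, 0.0]]
pooled = mean_pool(bag)  # single vector usable by any ordinary QSAR model
score = predict_bag(lambda d: d[0] + d[1], bag)  # toy stand-in for a model
print(pooled, score)
```

Replacing the plain average with learned per-conformer weights is one route by which a model could highlight the conformers that matter most for a prediction.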
Motivation: Whole genome expression profiling of large cohorts of different types of cancer led to the identification of distinct molecular subcategories (subtypes) that may partially explain the observed inter-tumoral heterogeneity. This is also the case for colorectal cancer (CRC), where several such categorizations have been proposed. Despite recent developments, the problem of subtype definition and recognition remains open, one of the causes being the intrinsic heterogeneity of each tumor, which is difficult to estimate from gene expression profiles. However, one of the observations of these studies indicates that there may be links between the dominant tumor morphology characteristics and the molecular subtypes. Benefiting from a large collection of CRC samples, comprising both gene expression and histopathology images, we investigated the possibility of building image-based classifiers able to predict the molecular subtypes. We employed deep convolutional neural networks for extracting local descriptors, which were then used for constructing a dictionary-based representation of each tumor sample. A set of support vector machine classifiers was trained to solve different binary decision problems, their combined outputs being used to predict one of the five molecular subtypes. Results: A hierarchical decomposition of the multi-class problem was obtained with an overall accuracy of 0.84 (95% CI = 0.79-0.88). The predictions from the image-based classifier showed significant prognostic value similar to their molecular counterparts. Contact: popovici@iba.muni.cz. Availability and Implementation: Source code used for the image analysis is freely available from https://github.com/higex/qpath. Supplementary information: Supplementary data are available at Bioinformatics online.
- MeSH
- Colorectal Neoplasms diagnosis genetics metabolism pathology MeSH
- Humans MeSH
- Biomarkers, Tumor * MeSH
- Neural Networks, Computer * MeSH
- Image Processing, Computer-Assisted methods MeSH
- Prognosis MeSH
- Gene Expression Regulation, Neoplastic MeSH
- Support Vector Machine MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
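The dictionary-based representation mentioned in the colorectal cancer abstract above can be pictured as a histogram of "visual words": each local descriptor is assigned to its nearest dictionary entry, and the sample is summarized by the normalized counts. The sketch below assumes that the dictionary (codebook) is already available, for example from a clustering step, and uses squared Euclidean distance; both are assumptions for illustration, since the abstract does not give these details, and the function names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def dictionary_histogram(local_descriptors, codebook):
    """Normalized histogram of codeword assignments for one sample."""
    counts = [0] * len(codebook)
    for descriptor in local_descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = len(local_descriptors)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy 2-D descriptors and a two-word codebook (made-up numbers).
codebook = [[0.0, 0.0], [10.0, 10.0]]
patches = [[1.0, 0.0], [9.0, 9.0], [0.0, 1.0], [11.0, 10.0]]
print(dictionary_histogram(patches, codebook))
```

The resulting fixed-length vector has the same size for every sample regardless of how many local descriptors were extracted, which is what makes it usable as input to classifiers such as support vector machines.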
Optimal descriptors calculated from Simplified Molecular Input Line Entry System (SMILES) notation have been used in quantitative structure-property relationships (QSPR) for the half-wave potential of N-benzylsalicylthioamides. The QSPR developed is a one-variable model based on optimal descriptors calculated with the Monte Carlo method. The approach has been checked with three random splits into training and test sets. Mechanistic interpretations (structural alerts related to the half-wave potential) of the model are discussed.
The enantiomers of trans-paroxetine (the selectand) were separated on four chiral stationary phases incorporating either quinine [ZWIX(+), ZWIX(+A)] or quinidine [ZWIX(-), ZWIX(-A)] and (R,R)-aminocyclohexanesulfonic acid [in ZWIX(-) and ZWIX(+A)] or (S,S)-aminocyclohexanesulfonic acid [in ZWIX(+) and ZWIX(-A)] chiral selectors. The zwitterionic nature of the phases is due to the presence of either (R,R)- or (S,S)-aminocyclohexanesulfonic acid in the selector structure bearing the quinuclidine moiety. The ZWIX(+) and ZWIX(-) phases are commercially available under the names CHIRALPAK ZWIX(+) and CHIRALPAK ZWIX(-), respectively. With the aim of rationalizing the enantiomer elution order on the above chiral stationary phases, a molecular dynamics protocol was applied and two energetic parameters were initially measured: selectand conformational energy and selectand interaction energy. In the search for other descriptors allowing a better fit to the experimental evidence, in the present work we consider an energetic parameter, defined as the selector conformational energy, which proved relevant in explaining the experimental elution order in most cases. Importantly, the computational data produced by the present study strongly support the outstanding role of the conformational energy of the chiral selector as it interacts with the analytes.
Mesoporous silica has emerged as a promising component in bio-enabling formulation strategies. However, there is currently a lack of predictive tools for assessing drug-silica interactions in the preformulation phase, when formulators have only minimal material to guide them. This study proposes a solution: a chromatographic method to rank apparent drug-silica affinity for mesoporous formulations. Using a dataset of 52 drugs, a hydrophilic interaction liquid chromatography (HILIC) screening method was developed, with a silica stationary phase to simulate the drug carrier. Molecular descriptors were calculated for the compounds to analyze HILIC retention times using a tree-based machine learning algorithm. For silica affinity, the distribution coefficient (LogD), the molecular shape descriptor Kappa1, and the number of conjugated bonds (NCB) were identified as possible critical parameters. Additionally, an amine-modified HILIC column was evaluated to simulate a surface-modified silica carrier. The classification tree analysis revealed that Abraham's hydrogen bonding acidity, the NCB and the pKa were determinants for a qualitative assessment of drug affinity to the modified silica. The classification into low, moderate, and high affinity to the stationary phase appeared to be useful in understanding drug release from mesoporous silica formulations, highlighting its potential for future research.
BACKGROUND: Although the etiology of chronic lymphocytic leukemia (CLL), the most common type of adult leukemia, is still unclear, strong evidence implicates antigen involvement in disease ontogeny and evolution. Primary and 3D structure analysis has been utilised in order to discover indications of antigenic pressure. The latter has been mostly based on the 3D models of the clonotypic B cell receptor immunoglobulin (BcR IG) amino acid sequences. Therefore, their accuracy is directly dependent on the quality of the model construction algorithms and the specific methods used to compare the ensuing models. Thus far, reliable and robust methods that can group the IG 3D models based on their structural characteristics are missing. RESULTS: Here we propose a novel method for clustering a set of proteins based on their 3D structure focusing on 3D structures of BcR IG from a large series of patients with CLL. The method combines techniques from the areas of bioinformatics, 3D object recognition and machine learning. The clustering procedure is based on the extraction of 3D descriptors, encoding various properties of the local and global geometrical structure of the proteins. The descriptors are extracted from aligned pairs of proteins. A combination of individual 3D descriptors is also used as an additional method. The comparison of the automatically generated clusters to manual annotation by experts shows an increased accuracy when using the 3D descriptors compared to plain bioinformatics-based comparison. The accuracy is increased even more when using the combination of 3D descriptors. CONCLUSIONS: The experimental results verify that the use of 3D descriptors commonly used for 3D object recognition can be effectively applied to distinguishing structural differences of proteins. The proposed approach can be applied to provide hints for the existence of structural groups in a large set of unannotated BcR IG protein files in both CLL and, by logical extension, other contexts where it is relevant to characterize BcR IG structural similarity. The method does not present any limitations in application and can be extended to other types of proteins.