Simplex representation of molecular structure
We describe a novel approach to reaction representation in which a reaction is treated as a combination of two mixtures: a mixture of reactants and a mixture of products. Each mixture, in turn, can be encoded using the previously reported simplex representation of molecular structure (SiRMS) descriptors. The feature vector representing the two mixtures is obtained either by concatenating the product and reactant descriptors or by taking the difference between the descriptors of products and reactants. This reaction representation does not require explicit labeling of the reaction center. A rigorous "product-out" cross-validation (CV) strategy is proposed. Unlike the naïve "reaction-out" CV approach, which is based on a random selection of items, the proposed strategy provides a more realistic estimate of prediction accuracy for reactions that result in novel products. The new methodology has been applied to model the rate constants of E2 reactions. It has been demonstrated that using the fragment control applicability domain approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
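The three ideas in this abstract (a reaction vector built from two mixture vectors, "product-out" fold assignment, and a fragment control check) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the per-molecule descriptor vectors are assumed to be precomputed fragment counts, and summing them into a mixture vector is a simplifying assumption that stands in for the actual SiRMS mixture descriptors. The fragment control rule used here (a query is in the domain only if every fragment it contains was seen in training) is likewise an assumed common formulation.

```python
from collections import defaultdict


def mixture_descriptor(molecules):
    """Sum per-molecule count vectors into one mixture vector.

    Assumption: a plain sum stands in for the real SiRMS mixture descriptors.
    """
    return [sum(column) for column in zip(*molecules)]


def reaction_vector(reactants, products, mode="concat"):
    """Build a reaction feature vector from reactant and product mixtures."""
    r = mixture_descriptor(reactants)
    p = mixture_descriptor(products)
    if mode == "concat":
        return r + p  # reactant block followed by product block
    if mode == "diff":
        return [pi - ri for pi, ri in zip(p, r)]  # products minus reactants
    raise ValueError("mode must be 'concat' or 'diff'")


def product_out_folds(product_keys, n_folds):
    """Assign reaction indices to folds so that all reactions sharing a
    product key land in the same fold (hypothetical 'product-out' split)."""
    groups = defaultdict(list)
    for index, key in enumerate(product_keys):
        groups[key].append(index)
    folds = [[] for _ in range(n_folds)]
    # Place the largest groups first, each into the currently smallest fold.
    for key in sorted(groups, key=lambda k: (-len(groups[k]), k)):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(groups[key])
    return folds


def in_fragment_domain(train_vectors, query):
    """Return True if every fragment present in the query vector
    (non-zero entry) also occurs in at least one training vector."""
    seen = [any(v[j] != 0 for v in train_vectors) for j in range(len(query))]
    return all(seen[j] or query[j] == 0 for j in range(len(query)))


if __name__ == "__main__":
    reactants = [[1, 0, 2], [0, 1, 0]]
    products = [[1, 1, 0], [0, 0, 1]]
    print(reaction_vector(reactants, products, "concat"))
    print(reaction_vector(reactants, products, "diff"))
    print(product_out_folds(["A", "B", "A", "C", "B", "A"], 2))
```

With a random "reaction-out" split, reactions 0, 2 and 5 in the example (all giving product "A") could be scattered across training and test folds; the grouped split keeps them together, so the test fold only contains products the model has not seen.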
The effect of the structure of organic compounds on acute toxicity in mice upon oral administration was studied using the 2D simplex representation of molecular structure and the random forest (RF) method. Satisfactory quantitative structure-activity relationship (QSAR) models were constructed (R² for the test set = 0.61-0.62), and the models were then interpreted. The contributions of known toxicophores with established mechanisms of action were calculated to confirm that the interpretation approach ranks them correctly relative to other structural fragments. The influence of the molecular surroundings of some toxicophores was analyzed. The contributions of other highly ranked fragments from the list of common functional groups and ring systems were also analyzed in order to find new potential toxicophores. The online version of the expert system "OCHEM" (https://ochem.eu) and the Arithmetic Mean Toxicity (AMT) approach were used for a comparative QSAR study.