Interpretation of QSAR Models: Mining Structural Patterns Taking into Account Molecular Context
Language English Country Germany Medium print-electronic
Document type journal articles, grant-supported work
PubMed
30346106
DOI
10.1002/minf.201800084
- Keywords
- Gaussian Mixture Modeling, QSAR interpretation, pattern mining
- MeSH
- antiprotozoal agents chemistry toxicity MeSH
- data mining methods MeSH
- quantitative structure-activity relationship * MeSH
- drug design * MeSH
- molecular docking simulation methods MeSH
- software MeSH
- Tetrahymena drug effects MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
- Substance names
- antiprotozoal agents MeSH
The study focused on the interpretation of QSAR models. The goal was to develop a workflow for identifying molecular fragments whose importance for the modelled property depends on their molecular context. Fragment contributions were calculated with a previously established approach, structural and physicochemical interpretation of QSAR models (SPCI), and their relative influence on the compounds' properties was characterised. The distributions of these contributions were analysed by Gaussian mixture modelling to identify groups of compounds (clusters) that contain the same fragment but in which the fragment contributes substantially differently to the property studied. SMARTSminer was used to detect patterns discriminating these groups from each other; where it did not yield useful patterns, the groups were inspected visually. The approach was applied to analyse the toxicity of 1984 compounds to Tetrahymena pyriformis, measured as 40-hour growth inhibition. The clustering technique correctly identified known toxicophoric patterns: it detected groups of compounds in which fragments occur in a specific molecular context that makes them contribute substantially more to toxicity. The results show that the interpretation of QSAR models can retrieve reasonable patterns even from data sets consisting of compounds with different mechanisms of action, which is difficult to achieve with conventional pattern or data mining approaches.
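The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the contribution values are invented for illustration, the helper names (`fit_gmm_1d`, `bic`, `assign`) are hypothetical, and selecting the number of components by BIC is an assumption rather than something the abstract states. The sketch fits one-dimensional Gaussian mixtures to the contributions of a single fragment across compounds and checks whether two context-dependent groups describe the data better than one.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(values, k, n_iter=200, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative helper)."""
    xs = sorted(values)
    n = len(xs)
    # Deterministic start: split the sorted values into k equal chunks.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, min_var)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var_j, min_var)  # guard against variance collapse
    log_lik = 0.0
    for x in values:
        mix = sum(w * normal_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
        log_lik += math.log(mix or 1e-300)
    return weights, means, variances, log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion; a 1-D mixture has 3k - 1 free parameters."""
    return (3 * k - 1) * math.log(n) - 2.0 * log_lik

def assign(x, weights, means, variances):
    """Index of the component with the highest weighted density at x."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    return max(range(len(dens)), key=dens.__getitem__)

# Invented contributions of one hypothetical fragment across 14 compounds:
# eight compounds where it contributes little, six where it contributes a lot.
contributions = [0.05, 0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13,
                 0.95, 1.05, 1.10, 0.98, 1.02, 0.90]

fits = {k: fit_gmm_1d(contributions, k) for k in (1, 2)}
scores = {k: bic(fits[k][3], k, len(contributions)) for k in fits}
best_k = min(scores, key=scores.get)  # lower BIC is better

weights, means, variances, _ = fits[best_k]
labels = [assign(x, weights, means, variances) for x in contributions]
print("selected components:", best_k)
print("cluster labels:", labels)
```

In practice a library implementation such as scikit-learn's `GaussianMixture` would be a more natural choice than hand-written EM; the pure-Python version is used here only to keep the sketch free of dependencies. The compounds in each resulting cluster would then be passed on to a pattern-mining step to look for structural features that distinguish the clusters.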