
CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes

W. Plonka, C. Stork, M. Šícho, J. Kirchmair

Bioorganic & medicinal chemistry. 2021 ; 46 (-) : 116388. [pub] 20210828

Language English Country Great Britain

Document type Journal Article, Research Support, Non-U.S. Gov't

The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
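The two metrics reported in the abstract, the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC), can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not data from the paper, and the helper names `matthews_corrcoef` and `roc_auc` as well as the 0.5 decision threshold are assumptions made for illustration only; this is not the authors' evaluation code.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC from binary labels (1 = inhibitor, 0 = non-inhibitor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 inhibitors and 5 non-inhibitors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

mcc = matthews_corrcoef(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"MCC = {mcc:.3f}, AUC = {auc:.3f}")
```

MCC uses all four cells of the confusion matrix, so it remains informative when inhibitors are the minority class, which is the imbalance the abstract highlights. AUC, by contrast, depends only on how the scores rank the compounds and not on the chosen threshold.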


000      
00000naa a2200000 a 4500
001      
bmc22003723
003      
CZ-PrNML
005      
20220127145933.0
007      
ta
008      
220113s2021 xxk f 000 0|eng||
009      
AR
024    7_
$a 10.1016/j.bmc.2021.116388 $2 doi
035    __
$a (PubMed)34488021
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a xxk
100    1_
$a Plonka, Wojciech $u Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany; FQS Poland (Fujitsu Group), Parkowa 11, 30-538 Cracow, Poland
245    10
$a CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes / $c W. Plonka, C. Stork, M. Šícho, J. Kirchmair
520    9_
$a The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
650    _2
$a cytochrome P-450 enzyme inhibitors $x chemical synthesis $x chemistry $x pharmacology $7 D065607
650    _2
$a cytochrome P-450 enzyme system $x metabolism $7 D003577
650    _2
$a dose-response relationship, drug $7 D004305
650    _2
$a humans $7 D006801
650    12
$a machine learning $7 D000069550
650    _2
$a models, molecular $7 D008958
650    _2
$a molecular structure $7 D015394
650    _2
$a structure-activity relationship $7 D013329
655    _2
$a journal article $7 D016428
655    _2
$a research support, non-U.S. gov't $7 D013485
700    1_
$a Stork, Conrad $u Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany
700    1_
$a Šícho, Martin $u CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
700    1_
$a Kirchmair, Johannes $u Universität Hamburg, Center for Bioinformatics (ZBH), Hamburg, Bundesstr. 43, 20146, Germany; Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Althanstr. 14, 1090 Vienna, Austria. Electronic address: johannes.kirchmair@univie.ac.at
773    0_
$w MED00000769 $t Bioorganic & medicinal chemistry $x 1464-3391 $g Vol. 46, No. - (2021), p. 116388
856    41
$u https://pubmed.ncbi.nlm.nih.gov/34488021 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20220113 $b ABA008
991    __
$a 20220127145930 $b ABA008
999    __
$a ok $b bmc $g 1751241 $s 1154872
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2021 $b 46 $c - $d 116388 $e 20210828 $i 1464-3391 $m Bioorganic & medicinal chemistry $n Bioorg Med Chem $x MED00000769
LZP    __
$a Pubmed-20220113
