
QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

HW. van den Maagdenberg, M. Šícho, DA. Araripe, S. Luukkonen, L. Schoenmaker, M. Jespers, OJM. Béquignon, MG. González, RL. van den Broek, A. Bernatavicius, JGC. van Hasselt, PH. van der Graaf, GJP. van Westen

Journal of cheminformatics. 2024 ; 16 (1) : 128. [pub] 20241114

Status: not indexed; Language: English; Country: England, United Kingdom

Document type: journal articles

Persistent link   https://www.medvik.cz/link/bmc25002139

Grant support
22-17367O Czech Science Foundation Grant
LM2023052 Ministry of Education, Youth and Sports of the Czech Republic
955879 HORIZON EUROPE Marie Sklodowska-Curie Actions
NGFOP2201 Dutch National Growth Fund

Building reliable and robust quantitative structure-property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously growing and evaluating different algorithms and methodologies can be arduous. Finally, the last hurdle that researchers face is to ensure the reproducibility of their models and facilitate their transferability into practice. In this work, we introduce QSPRpred, a toolkit for analysis of bioactivity data sets and QSPR modelling, which attempts to address the aforementioned challenges. QSPRpred's modular Python API enables users to intuitively describe different parts of a modelling workflow using a plethora of pre-implemented components, but also integrates customized implementations in a "plug-and-play" manner. QSPRpred data sets and models are directly serializable, which means they can be readily reproduced and put into operation after training as the models are saved with all required data pre-processing steps to make predictions on new compounds directly from SMILES strings. The general-purpose character of QSPRpred is also demonstrated by inclusion of support for multi-task and proteochemometric modelling. The package is extensively documented and comes with a large collection of tutorials to help new users. In this paper, we describe all of QSPRpred's functionalities and also conduct a small benchmarking case study to illustrate how different components can be leveraged to compare a diverse set of models. QSPRpred is fully open-source and available at https://github.com/CDDLeiden/QSPRpred.

Scientific Contribution: QSPRpred aims to provide a complex, but comprehensive Python API to conduct all tasks encountered in QSPR modelling from data preparation and analysis to model creation and model deployment. In contrast to similar packages, QSPRpred offers a wider and more exhaustive range of capabilities and integrations with many popular packages that also go beyond QSPR modelling. A significant contribution of QSPRpred is also in its automated and highly standardized serialization scheme, which significantly improves reproducibility and transferability of models.


000      
00000naa a2200000 a 4500
001      
bmc25002139
003      
CZ-PrNML
005      
20250123102015.0
007      
ta
008      
250117s2024 enk f 000 0|eng||
009      
AR
024    7_
$a 10.1186/s13321-024-00908-y $2 doi
035    __
$a (PubMed)39543652
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a enk
100    1_
$a van den Maagdenberg, Helle W $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $1 https://orcid.org/0000000297187806
245    10
$a QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool / $c HW. van den Maagdenberg, M. Šícho, DA. Araripe, S. Luukkonen, L. Schoenmaker, M. Jespers, OJM. Béquignon, MG. González, RL. van den Broek, A. Bernatavicius, JGC. van Hasselt, PH. van der Graaf, GJP. van Westen
520    9_
$a Building reliable and robust quantitative structure-property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously growing and evaluating different algorithms and methodologies can be arduous. Finally, the last hurdle that researchers face is to ensure the reproducibility of their models and facilitate their transferability into practice. In this work, we introduce QSPRpred, a toolkit for analysis of bioactivity data sets and QSPR modelling, which attempts to address the aforementioned challenges. QSPRpred's modular Python API enables users to intuitively describe different parts of a modelling workflow using a plethora of pre-implemented components, but also integrates customized implementations in a "plug-and-play" manner. QSPRpred data sets and models are directly serializable, which means they can be readily reproduced and put into operation after training as the models are saved with all required data pre-processing steps to make predictions on new compounds directly from SMILES strings. The general-purpose character of QSPRpred is also demonstrated by inclusion of support for multi-task and proteochemometric modelling. The package is extensively documented and comes with a large collection of tutorials to help new users. In this paper, we describe all of QSPRpred's functionalities and also conduct a small benchmarking case study to illustrate how different components can be leveraged to compare a diverse set of models. QSPRpred is fully open-source and available at https://github.com/CDDLeiden/QSPRpred. Scientific Contribution: QSPRpred aims to provide a complex, but comprehensive Python API to conduct all tasks encountered in QSPR modelling from data preparation and analysis to model creation and model deployment. In contrast to similar packages, QSPRpred offers a wider and more exhaustive range of capabilities and integrations with many popular packages that also go beyond QSPR modelling. A significant contribution of QSPRpred is also in its automated and highly standardized serialization scheme, which significantly improves reproducibility and transferability of models.
590    __
$a NEINDEXOVÁNO
655    _2
$a časopisecké články $7 D016428
700    1_
$a Šícho, Martin $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, Prague, A-4040, Czech Republic $1 https://orcid.org/0000000287711731 $7 xx0222821
700    1_
$a Araripe, David Alencar $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, Leiden, 2333ZC, The Netherlands $1 https://orcid.org/0000000251041959
700    1_
$a Luukkonen, Sohvi $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Altenberger Straße 69, Linz, 610101, Austria $1 https://orcid.org/0000000193871427
700    1_
$a Schoenmaker, Linde $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $1 https://orcid.org/0000000198791004
700    1_
$a Jespers, Michiel $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $1 https://orcid.org/0009000320830159
700    1_
$a Béquignon, Olivier J M $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u Department of Neurosurgery, Brain Tumor Center Amsterdam, Amsterdam University Medical Center, Cancer Center Amsterdam, De Boelelaan 1117, Amsterdam, 1081 HV, The Netherlands $1 https://orcid.org/0000000275549220
700    1_
$a González, Marina Gorostiola $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u Oncode Institute, Utrecht, The Netherlands $1 https://orcid.org/0000000315680881
700    1_
$a van den Broek, Remco L $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $1 https://orcid.org/0009000856611157
700    1_
$a Bernatavicius, Andrius $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, Leiden, 2333 CA, The Netherlands $1 https://orcid.org/0000000200583678
700    1_
$a van Hasselt, J G Coen $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $1 https://orcid.org/0000000216647314
700    1_
$a van der Graaf, Piet H $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands $u Certara UK, University Road, Canterbury Innovation Centre, Unit 43, Canterbury, Kent, CT2 7FG, UK $1 https://orcid.org/0000000313143484
700    1_
$a van Westen, Gerard J P $u Computational Drug Discovery, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, The Netherlands. gerard@lacdr.leidenuniv.nl $1 https://orcid.org/0000000307171817
773    0_
$w MED00181723 $t Journal of cheminformatics $x 1758-2946 $g Roč. 16, č. 1 (2024), s. 128
856    41
$u https://pubmed.ncbi.nlm.nih.gov/39543652 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y - $z 0
990    __
$a 20250117 $b ABA008
991    __
$a 20250123102009 $b ABA008
999    __
$a ok $b bmc $g 2254483 $s 1238142
BAS    __
$a 3
BAS    __
$a PreBMC-PubMed-not-MEDLINE
BMC    __
$a 2024 $b 16 $c 1 $d 128 $e 20241114 $i 1758-2946 $m Journal of cheminformatics $n J Cheminform $x MED00181723
GRA    __
$a 22-17367O $p Czech Science Foundation Grant
GRA    __
$a LM2023052 $p Ministry of Education, Youth and Sports of the Czech Republic
GRA    __
$a 955879 $p HORIZON EUROPE Marie Sklodowska-Curie Actions
GRA    __
$a NGFOP2201 $p Dutch National Growth Fund
LZP    __
$a Pubmed-20250117
