GenUI: interactive and extensible open source software platform for de novo molecular generation and cheminformatics
Status PubMed-not-MEDLINE
Language English
Country Great Britain, England
Media electronic
Document type Journal Article
Grant support
LM2018130
Ministerstvo Školství, Mládeže a Tělovýchovy
RVO 68378050-KAV-NPUI
Ministerstvo Školství, Mládeže a Tělovýchovy
PubMed
34563271
PubMed Central
PMC8465716
DOI
10.1186/s13321-021-00550-y
PII: 10.1186/s13321-021-00550-y
- Keywords
- De novo drug design, Deep learning, Graphical user interface, Molecule generation, Web application
- Publication type
- Journal Article
Many contemporary cheminformatics methods, including computer-aided de novo drug design, hold promise to significantly accelerate drug discovery and reduce its cost. Thanks to this attractive outlook, the field has thrived, and in the past few years it has seen especially significant growth, mainly due to the emergence of novel methods based on deep neural networks. This growth is also apparent in the development of novel de novo drug design methods, with many new generative algorithms now available. However, widespread adoption of new generative techniques in fields like medicinal chemistry or chemical biology still lags behind the most recent developments. On closer inspection, this is not surprising: successfully integrating the most recent de novo drug design methods into existing processes and pipelines requires close collaboration between diverse groups of experimental and theoretical scientists. Therefore, to accelerate the adoption of both modern and traditional de novo molecular generators, we developed Generator User Interface (GenUI), a software platform that makes it possible to integrate molecular generators within a feature-rich graphical user interface that is easy to use for experts of diverse backgrounds. GenUI is implemented as a web service, and its interfaces offer access to cheminformatics tools for data preprocessing, model building, molecule generation, and interactive chemical space visualization. Moreover, the platform is easy to extend with customizable frontend React.js components and backend Python extensions. GenUI is open source, and a recently developed de novo molecular generator, DrugEx, was integrated as a proof of principle. In this work, we present the architecture and implementation details of GenUI and discuss how it can facilitate collaboration among the disparate communities interested in de novo molecular generation and computer-aided drug discovery.