This record comes from PubMed

Molpher: a software framework for systematic chemical space exploration

2014 Mar 21; 6(1): 7. [Epub 2014 Mar 21]

Status: PubMed-not-MEDLINE; Language: English; Country: England, Great Britain; Media: electronic

Document type: Journal Article

BACKGROUND: Chemical space is the virtual space occupied by all chemically meaningful organic compounds. It is an important concept in contemporary chemoinformatics research, and its systematic exploration is vital to the discovery of both novel drugs and new tools for chemical biology.

RESULTS: In this paper, we describe Molpher, an open-source framework for the systematic exploration of chemical space. Through a process we term 'molecular morphing', Molpher produces a path of structurally related compounds. This path is generated by the iterative application of so-called 'morphing operators' that represent simple structural changes, such as the addition or removal of an atom or a bond. Molpher incorporates an optimized parallel exploration algorithm, compound logging and a two-dimensional visualization of the exploration process. Its feature set can be easily extended by implementing additional morphing operators, chemical fingerprints, similarity measures and visualization methods. Molpher not only offers an intuitive graphical user interface, but can also be run in batch mode. This enables users to easily incorporate molecular morphing into their existing drug discovery pipelines.

CONCLUSIONS: Molpher is an open-source software framework for the design of virtual chemical libraries focused on a particular mechanistic class of compounds. These libraries, represented by a morphing path and its surroundings, provide valuable starting data for future in silico and in vitro experiments. Molpher is highly extensible and can be easily incorporated into any existing computational drug design pipeline.
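The idea of building a path by iteratively applying simple morphing operators can be illustrated with a deliberately simplified sketch. It is not Molpher's implementation: here a "molecule" is assumed to be a plain string of atom symbols rather than a chemical graph, the three operators (`add`, `remove`, `mutate`) and the names `ATOMS`, `morphs`, `edit_distance` and `morph_path` are hypothetical, the search is assumed to run between a start and a target structure, and a string edit distance stands in for a fingerprint-based similarity measure.

```python
# Toy illustration of molecular morphing on strings of atom symbols.
# Assumptions (not from Molpher): linear "molecules", a three-letter atom
# alphabet, and edit distance in place of a chemical similarity measure.

ATOMS = "CNO"  # hypothetical toy alphabet


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def morphs(mol):
    """Return all structures reachable by applying one toy operator once."""
    out = set()
    for i in range(len(mol) + 1):
        for atom in ATOMS:
            out.add(mol[:i] + atom + mol[i:])      # add an atom
    for i in range(len(mol)):
        if len(mol) > 1:
            out.add(mol[:i] + mol[i + 1:])         # remove an atom
        for atom in ATOMS:
            out.add(mol[:i] + atom + mol[i + 1:])  # mutate an atom
    out.discard(mol)  # a morph must differ from its parent
    return out


def morph_path(source, target, max_steps=100):
    """Greedy walk: at each step keep the morph closest to the target."""
    path = [source]
    while path[-1] != target and len(path) <= max_steps:
        best = min(morphs(path[-1]),
                   key=lambda m: (edit_distance(m, target), m))
        path.append(best)
    return path


if __name__ == "__main__":
    for step in morph_path("CCO", "CNCCN"):
        print(step)
```

Each printed line differs from the previous one by a single operator application, which mirrors the notion of a path of structurally related compounds. A real exploration would operate on molecular graphs and would typically keep many candidates per iteration rather than a single greedy choice.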
