Scaffold analysis of PubChem database as background for hierarchical scaffold-based visualization
Status: PubMed-not-MEDLINE
Language: English
Country: Great Britain, England
Media: electronic-ecollection
Document type: Journal Article
PubMed: 28090217
PubMed Central: PMC5199768
DOI: 10.1186/s13321-016-0186-7
PII: 186
- Keywords
- Chemical space, PubChem, Scaffold, Treemap, Visualization
- Publication type
- Journal Article
BACKGROUND: Visualization of large molecular datasets is a challenging yet important task in diverse fields of chemistry, ranging from material engineering to drug design. In drug design especially, modern high-throughput screening methods generate large amounts of molecular data that call for methods enabling their analysis. One such method is the classification of compounds by their molecular scaffolds, a concept widely used by medicinal chemists to group molecules with similar properties. This classification can then be used for intuitive visualization of compounds.

RESULTS: In this paper, we propose a scaffold hierarchy derived from a large-scale analysis of the PubChem Compound database. The analysis not only provides insight into the scaffold diversity of the PubChem Compound database, but also enables scaffold-based hierarchical visualization of user compound data sets against the background of empirical chemical space, as defined by the PubChem data, or against the background of any other user-defined data set. The visualization is performed by a web-based client-server application called Scaffvis. It provides an interactive, zoomable treemap visualization of data sets of up to hundreds of thousands of molecules. Scaffvis is free to use, and its source code has been published under an open source license.
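To make the idea of a hierarchical, scaffold-based treemap more concrete, the following is a minimal sketch, not the Scaffvis implementation. It uses the simple slice-and-dice treemap layout, which may differ from the layout Scaffvis actually applies. The `Node` class, the `slice_and_dice` function, the scaffold labels, and all molecule counts are hypothetical and chosen only for illustration; the sketch also assumes that molecule counts are stored on leaf scaffolds only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Node:
    """One scaffold in a hypothetical hierarchy; `count` is the number of
    molecules assigned directly to it (assumed non-zero only for leaves)."""
    name: str
    count: int = 0
    children: List["Node"] = field(default_factory=list)

    def total(self) -> int:
        # Molecules under this scaffold: its own count plus all descendants.
        return self.count + sum(child.total() for child in self.children)


def slice_and_dice(node: Node, x: float, y: float, w: float, h: float,
                   depth: int = 0,
                   out: Optional[Dict[str, Rect]] = None) -> Dict[str, Rect]:
    """Assign every node a rectangle. Children divide the parent's rectangle
    in proportion to their molecule totals, alternating the split direction
    at each level of the hierarchy."""
    if out is None:
        out = {}
    out[node.name] = (x, y, w, h)
    total = sum(child.total() for child in node.children)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in node.children:
        share = child.total() / total if total else 0.0
        if depth % 2 == 0:
            # Even levels: lay children side by side along the x axis.
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd levels: stack children along the y axis.
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical two-level scaffold hierarchy with made-up molecule counts.
root = Node("all", children=[
    Node("scaffold A", children=[Node("scaffold A1", 30), Node("scaffold A2", 10)]),
    Node("scaffold B", 60),
])
layout = slice_and_dice(root, 0.0, 0.0, 100.0, 100.0)
for name, rect in layout.items():
    print(name, rect)
```

In this sketch the area of each rectangle is proportional to the number of molecules under the corresponding scaffold, so zooming into a scaffold amounts to calling `slice_and_dice` again with that node as the root and the full canvas as its rectangle.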