• This record comes from PubMed

Scaffold analysis of PubChem database as background for hierarchical scaffold-based visualization

2016; 8: 74. [Epub 2016-12-29]

Status: PubMed-not-MEDLINE
Language: English
Country: Great Britain, England
Media: electronic-ecollection
Document type: Journal Article

BACKGROUND: Visualization of large molecular data sets is a challenging yet important task in diverse fields of chemistry, ranging from materials engineering to drug design. In drug design especially, modern high-throughput screening methods generate large amounts of molecular data that call for methods enabling their analysis. One such method is the classification of compounds by their molecular scaffolds, a concept widely used by medicinal chemists to group molecules with similar properties. This classification can then be used for intuitive visualization of compounds.

RESULTS: In this paper, we propose a scaffold hierarchy derived from a large-scale analysis of the PubChem Compound database. The analysis not only provides insights into the scaffold diversity of the PubChem Compound database, but also enables scaffold-based hierarchical visualization of user compound data sets against the background of empirical chemical space, as defined by the PubChem data, or against the background of any other user-defined data set. The visualization is performed by a web-based client-server application called Scaffvis. It provides an interactive, zoomable tree map visualization of data sets of up to hundreds of thousands of molecules. Scaffvis is free to use and its source code has been published under an open source license.

