Advanced SPARQL querying in small molecule databases

2016; 8: 31. [Epub 2016-06-06]

Status: PubMed-not-MEDLINE; Language: English; Country: United Kingdom, England; Medium: electronic-eCollection

Document type: journal articles

Persistent link: https://www.medvik.cz/link/pmid27275187

BACKGROUND: In recent years, the Resource Description Framework (RDF) and the SPARQL query language have become more widely used in the area of cheminformatics and bioinformatics databases. These technologies allow better interoperability of various data sources and powerful searching facilities. However, we identified several deficiencies that make usage of such RDF databases restrictive or challenging for common users.

RESULTS: We extended a SPARQL engine to be able to use special procedures inside SPARQL queries. This allows the user to work with data that cannot be simply precomputed and thus cannot be directly stored in the database. We designed an algorithm that checks a query against data ontology to identify possible user errors. This greatly improves query debugging. We also introduced an approach to visualize retrieved data in a user-friendly way, based on templates describing visualizations of resource classes. To integrate all of our approaches, we developed a simple web application.

CONCLUSIONS: Our system was implemented successfully, and we demonstrated its usability on the ChEBI database transformed into RDF form. To demonstrate procedure call functions, we employed compound similarity searching based on OrChem. The application is publicly available at https://bioinfo.uochb.cas.cz/projects/chemRDF.
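
The idea of checking a query against the data ontology can be pictured with a small Python sketch. This is not the authors' algorithm, which the abstract does not detail. It is a simplified illustration that assumes each property declares a single domain and range class and ignores subclass relations. The `ex:` property names, the toy `ONTOLOGY` table and the `check_patterns` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology: property -> (domain class, range class).
# Real ontologies also carry subclass hierarchies, which this sketch ignores.
ONTOLOGY = {
    "ex:hasFormula": ("ex:Compound", "xsd:string"),
    "ex:hasRole": ("ex:Compound", "ex:Role"),
    "ex:roleLabel": ("ex:Role", "xsd:string"),
}


def check_patterns(patterns, ontology=ONTOLOGY):
    """Return warnings for triple patterns that conflict with the ontology.

    `patterns` is a list of (subject, predicate, object) strings, where
    variables start with '?', as in a SPARQL basic graph pattern.
    """
    warnings = []
    var_classes = defaultdict(set)  # variable -> classes implied by its usage

    for subj, pred, obj in patterns:
        if pred not in ontology:
            # Typical user error: a misspelled or non-existent property.
            warnings.append(f"unknown property {pred}")
            continue
        domain, range_ = ontology[pred]
        if subj.startswith("?"):
            var_classes[subj].add(domain)
        if obj.startswith("?"):
            var_classes[obj].add(range_)

    # A variable forced into two different classes is likely a mistake.
    for var, classes in sorted(var_classes.items()):
        if len(classes) > 1:
            warnings.append(f"{var} is used as {', '.join(sorted(classes))}")
    return warnings


if __name__ == "__main__":
    query = [
        ("?c", "ex:hasFormula", "?f"),
        ("?f", "ex:roleLabel", "?l"),  # ?f is a string, not a role
    ]
    for message in check_patterns(query):
        print(message)
```

Under these assumptions, a query that reuses a literal-valued variable as the subject of another property is flagged before it is ever sent to the database, which is the kind of debugging aid the abstract describes.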

Latest 20 citations


IDSM ChemWebRDF: SPARQLing small-molecule datasets

2021 May 12; 13 (1): 38. [Epub 2021-05-12]

Interoperable chemical structure search service

2019 Jun 28; 11 (1): 45. [Epub 2019-06-28]
