Interoperable chemical structure search service
Status: PubMed-not-MEDLINE. Language: English. Country: England, United Kingdom. Medium: electronic.
Document type: journal article
Grant support
LM2015047: Ministerstvo Školství, Mládeže a Tělovýchovy
LM2015042: Ministerstvo Školství, Mládeže a Tělovýchovy
LM2015085: Ministerstvo Školství, Mládeže a Tělovýchovy
61388963: Ústav Organické Chemie a Biochemie, Akademie Věd České Republiky
PubMed: 31254167
PubMed Central: PMC6599361
DOI: 10.1186/s13321-019-0367-2
PII: 10.1186/s13321-019-0367-2
- Keywords: Interoperability, Linked data, Small molecule databases, Substructure search
- Publication type: journal article
MOTIVATION: The existing connections between large databases of chemicals, proteins, metabolites and assays offer valuable resources for research in fields ranging from drug design to metabolomics. Transparent search across multiple databases provides a way to use these resources efficiently. To simplify such searches, many databases have adopted semantic technologies that allow interoperable querying of the datasets using the SPARQL query language. However, the interoperable interfaces of the chemical databases still lack structure-driven chemical search, which is a fundamental method of data discovery in the chemical search space.

RESULTS: We present a SPARQL service that augments existing semantic services by making interoperable substructure and similarity searches in small-molecule databases possible. The service thus offers new possibilities for querying interoperable databases, and simplifies the writing of heterogeneous queries that include chemical-structure search terms.

AVAILABILITY: The service is freely available and accessible through a standard SPARQL endpoint interface. The service documentation and user-oriented demonstration interfaces that allow quick exploratory querying of datasets are available at https://idsm.elixir-czech.cz.
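To illustrate what a structure-driven query against a standard SPARQL endpoint might look like, the sketch below builds a substructure-search query and encodes it as a SPARQL 1.1 Protocol GET request. It is a minimal sketch, not the service's documented interface: the `sachem:` namespace, the `substructureSearch` and `query` property names, and the endpoint path are assumptions that should be checked against the service documentation, and the helper function names are hypothetical. The code only constructs the request and does not send it.

```python
from urllib.parse import urlencode

# Assumption: namespace of the structure-search vocabulary; verify it
# against the service documentation before use.
SACHEM_NS = "http://bioinfo.uochb.cas.cz/rdf/v1.0/sachem#"

# Assumption: hypothetical endpoint path under the documented site.
ENDPOINT = "https://idsm.elixir-czech.cz/sparql/endpoint/idsm"


def build_substructure_query(smiles: str, limit: int = 10) -> str:
    """Return a SPARQL query asking for compounds that contain `smiles`."""
    # Escape backslashes and quotes so the SMILES string stays a valid
    # SPARQL string literal (backslashes occur in stereo bond notation).
    escaped = smiles.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX sachem: <{SACHEM_NS}>\n"
        "SELECT ?compound WHERE {\n"
        # Assumed property names: substructureSearch and query.
        "  ?compound sachem:substructureSearch [\n"
        f'    sachem:query "{escaped}" ]\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


# Example: look for compounds containing a phenol fragment.
query = build_substructure_query("c1ccccc1O")
url = build_request_url(ENDPOINT, query)
```

The resulting URL can be fetched with any HTTP client, with an `Accept` header selecting one of the standard SPARQL result formats (JSON, XML, CSV or TSV). Because the search term is an ordinary graph pattern, it could in principle be combined with other patterns or a federated `SERVICE` clause in the same query.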
The IDSM mass spectrometry extension: searching mass spectra using SPARQL
The LOTUS initiative for open knowledge management in natural products research
IDSM ChemWebRDF: SPARQLing small-molecule datasets
Correction to: Interoperable chemical structure search service