Interoperable chemical structure search service

2019 Jun 28; 11(1): 45. [epub] 20190628

Status: PubMed-not-MEDLINE; Language: English; Country: England, United Kingdom; Medium: electronic

Document type: journal article

Persistent link: https://www.medvik.cz/link/pmid31254167

Grant support
LM2015047 Ministerstvo Školství, Mládeže a Tělovýchovy
LM2015042 Ministerstvo Školství, Mládeže a Tělovýchovy
LM2015085 Ministerstvo Školství, Mládeže a Tělovýchovy
61388963 Ústav Organické Chemie a Biochemie, Akademie Věd České Republiky

Links

PubMed 31254167
PubMed Central PMC6599361
DOI 10.1186/s13321-019-0367-2
PII: 10.1186/s13321-019-0367-2

MOTIVATION: The existing connections between large databases of chemicals, proteins, metabolites and assays offer valuable resources for research in fields ranging from drug design to metabolomics. Transparent search across multiple databases provides a way to use these resources efficiently. To simplify such searches, many databases have adopted semantic technologies that allow interoperable querying of the datasets using the SPARQL query language. However, the interoperable interfaces of chemical databases still lack structure-driven chemical search, which is a fundamental method of data discovery in the chemical search space.

RESULTS: We present a SPARQL service that augments existing semantic services by enabling interoperable substructure and similarity searches in small-molecule databases. The service thus offers new possibilities for querying interoperable databases and simplifies the writing of heterogeneous queries that include chemical-structure search terms.

AVAILABILITY: The service is freely available and accessible through a standard SPARQL endpoint interface. The service documentation and user-oriented demonstration interfaces, which allow quick exploratory querying of the datasets, are available at https://idsm.elixir-czech.cz.

Erratum in

PubMed
