BACKGROUND: Various information systems for medical curriculum mapping and harmonization have been developed and successfully applied to date. However, methods for exploiting the datasets captured in these systems are largely lacking. METHOD: We reviewed the existing medical terminologies, nomenclatures, and coding and classification systems in order to select the most suitable one and apply it in delivering visual analytic tools and reports for the benefit of medical curriculum designers and innovators. RESULTS: The formal description of a particular general medicine curriculum is based on 1347 learning units covering 7075 learning outcomes. Two data-analytical reports have been developed and discussed, showing how the curriculum is consistent with the MeSH thesaurus and how the MeSH thesaurus can be used to demonstrate the interconnectivity of the curriculum through association analysis. CONCLUSION: Although the MeSH thesaurus is designed mainly to index medical literature and support searching in bibliographic databases, we have shown that its use in medical curriculum mapping is beneficial for curriculum designers and innovators. The presented approach can be followed wherever it is necessary to identify all the mandatory components needed for a transparent and comprehensive overview of medical curriculum data.
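The association analysis mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which measures or tools were used, so the support, confidence, and lift computation below is only one common way to quantify how often two descriptors co-occur. The function name `pair_association`, the unit identifiers, and the descriptor assignments are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_association(units):
    """Compute support, confidence and lift for every descriptor pair
    that co-occurs in at least one learning unit."""
    n = len(units)
    term_count = Counter()   # number of units containing each descriptor
    pair_count = Counter()   # number of units containing both descriptors
    for terms in units.values():
        unique = sorted(set(terms))
        term_count.update(unique)
        pair_count.update(combinations(unique, 2))

    rules = {}
    for (a, b), both in pair_count.items():
        support = both / n
        rules[(a, b)] = {
            "support": support,
            "confidence_a_to_b": both / term_count[a],
            "confidence_b_to_a": both / term_count[b],
            # lift > 1 means the pair co-occurs more often than expected
            # if the two descriptors were independent
            "lift": support / ((term_count[a] / n) * (term_count[b] / n)),
        }
    return rules


# Hypothetical learning units annotated with MeSH-style descriptors.
units = {
    "unit-1": ["Heart Failure", "Diuretics"],
    "unit-2": ["Heart Failure", "Diuretics", "Hypertension"],
    "unit-3": ["Hypertension"],
    "unit-4": ["Heart Failure"],
}
rules = pair_association(units)
```

Pairs with high lift point to learning units that are thematically connected even when they belong to different courses, which is one way such an analysis could expose the interconnectivity of a curriculum.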
OBJECTIVE: We investigate machine translation (MT) of user search queries in the context of cross-lingual information retrieval (IR) in the medical domain. The main focus is on techniques to adapt MT to increase translation quality; however, we also explore MT adaptation to improve effectiveness of cross-lingual IR. METHODS AND DATA: Our MT system is Moses, a state-of-the-art phrase-based statistical machine translation system. The IR system is based on the BM25 retrieval model implemented in the Lucene search engine. The MT techniques employed in this work include in-domain training and tuning, intelligent training data selection, optimization of phrase table configuration, compound splitting, and exploiting synonyms as translation variants. The IR methods include morphological normalization and using multiple translation variants for query expansion. The experiments are performed and thoroughly evaluated on three language pairs: Czech-English, German-English, and French-English. MT quality is evaluated on data sets created within the Khresmoi project and IR effectiveness is tested on the CLEF eHealth 2013 data sets. RESULTS: The search query translation results achieved in our experiments are outstanding - our systems outperform not only our strong baselines, but also Google Translate and Microsoft Bing Translator in direct comparison carried out on all the language pairs. The baseline BLEU scores increased from 26.59 to 41.45 for Czech-English, from 23.03 to 40.82 for German-English, and from 32.67 to 40.82 for French-English. This is a 55% improvement on average. In terms of the IR performance on this particular test collection, a significant improvement over the baseline is achieved only for French-English. For Czech-English and German-English, the increased MT quality does not lead to better IR results. CONCLUSIONS: Most of the MT techniques employed in our experiments improve MT of medical search queries. 
Intelligent training data selection in particular proves very successful for domain adaptation of MT. Some improvement is also obtained from German compound splitting on the source-language side. Translation quality, however, does not appear to correlate with IR performance: better translation does not necessarily yield better retrieval. We discuss in detail the contribution of the individual techniques and state-of-the-art features and provide future research directions.
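The retrieval side of the abstract above (BM25 ranking with query expansion from multiple translation variants) can be sketched as follows. This is a self-contained illustration of the Okapi BM25 formula, not Lucene's implementation and not the authors' system. The parameter values `k1=1.2` and `b=0.75` are commonly used defaults assumed here, and the helper `expand_query`, the toy documents, and the translation variants are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # document frequency: in how many documents each term appears
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in set(query_terms):
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # term frequency saturation with document length normalization
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(variants):
    """Merge the tokens of several translation variants into one query."""
    terms = []
    for variant in variants:
        for tok in variant.lower().split():
            if tok not in terms:
                terms.append(tok)
    return terms


# Hypothetical document collection, already tokenized.
docs = [
    "treatment of myocardial infarction".split(),
    "heart attack symptoms and first aid".split(),
    "seasonal allergy treatment".split(),
]
single = bm25_scores(expand_query(["heart attack"]), docs)
expanded = bm25_scores(
    expand_query(["heart attack", "myocardial infarction"]), docs
)
```

With a single translation, only the second document matches. Adding a second variant lets the first document be retrieved as well, which is the intended effect of using translation variants for query expansion. As the abstract notes, such gains did not materialize for every language pair.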
Sharing computing capacity over high-speed computer networks opens new possibilities for complex data processing in biomedicine. A necessary condition for such processing, however, is the existence of sufficiently flexible tools for indexing the data. Developments in recent years suggest that ontological systems, which build on systems for classifying medical concepts, could serve as such tools.
1 electronic optical disc (CD-ROM) : col. ; 13 cm
- MeSH
- Subject Headings MeSH
- Dictionaries, Medical as Topic MeSH
- Unified Medical Language System MeSH
- Publication type
- Dictionary MeSH
- Conspectus
- Librarianship. Informatics
- NML Fields
- librarianship, information science and museology
- medicine
- NML Publication type
- CD-ROM
Medical subject headings, ISSN 1047-5711
I-167, 1309 p. ; 32 cm
1-1. 803 p. ; 32 cm
Studies in health technology and informatics, ISSN 0926-9630 Vol. 53
XII, 82 p. ; 24 cm