An optimum data warehouse for epidemiological analysis using the National database of health insurance claims of Japan
Tomohide Iwao, Genta Kato, Shigeru Ohtsuru, Eiji Kondoh, Takeo Nakayama, Tomohiro Kuroda
Language: English; Country: Czech Republic
- MeSH
- data analysis * MeSH
- big data MeSH
- databases as topic MeSH
- epidemiologic studies * MeSH
- humans MeSH
- delivery of health care MeSH
- universal health insurance MeSH
- Check Tag
- humans MeSH
- Geographic names
- Japan MeSH
Background: While administrative databases for health care are increasingly used as research tools, such databases generally contain only health insurance claims data, the contents of which are insufficient for conducting epidemiological research. Creating a dataset appropriate for a specific analysis requires technical expertise and familiarity with data analysis. The aim of our research is to develop a data warehouse (DW) accessible to epidemiology researchers who lack this expertise.
Methods: We began by adding attributes commonly used in the epidemiological field to the National Database of Health Insurance Claims of Japan (NDB), to construct a Research Question Oriented DB. Secondly, we developed a versatile analysis unit schema by which the Research Question Oriented DW was reconstructed as per-patient units, covering demographics including sex, age group, etc. We then proposed a pattern relational calculus by which research-specific attributes can be added without expert knowledge of SQL. Finally, we applied the DW in two epidemiological studies.
Results: In both studies, the coverage of attributes constructed by the versatile analysis unit schema alone was limited. The versatile analysis unit schema covered 12% (3/25) of the attributes used in one study and 15% (3/20) in the other. On the other hand, the pattern relational calculus we proposed covered all remaining attributes that the researchers used in their studies.
Conclusion: Together, the versatile analysis unit schema and the pattern relational calculus covered all attributes used in the two epidemiological studies. This shows that, even within a limited scope, our method allows researchers who have little knowledge of SQL to carry out their respective epidemiological studies.
Abbreviations and Terminologies: NDB-SD: NDB Sampling Data set; DW: Data Warehouse; Schema: design of attributes in relations in the relational model theory; Relation: table with no duplicate tuples; Attribute: column name or variable name in relations; Primary key: one or more attributes that uniquely identify each tuple in a relation; Tuple: combination of attributes in a relation, almost the same meaning as row; Tuple relational calculus: logical expression used in the relational model theory; SQL: database language based on the relational model theory.
Department of Gynecology and Obstetrics, Kyoto University Hospital, Japan
Department of Primary Care and Emergency Medicine, Kyoto University Hospital, Japan
Solutions Center for Health Insurance Claims, Kyoto University Hospital, Japan
- 000
- 00000naa a2200000 a 4500
- 001
- bmc20009452
- 003
- CZ-PrNML
- 005
- 20220830134612.0
- 007
- cr|cn|
- 008
- 200623s2019 xr fs 000 0|eng||
- 009
- eAR
- 024 7_
- $a 10.24105/ejbi.2019.15.3.1 $2 doi
- 040 __
- $a ABA008 $d ABA008 $e AACR2 $b cze
- 041 0_
- $a eng
- 044 __
- $a xr
- 100 1_
- $a Iwao, Tomohide $u Division of Medical Information Technology and Administration Planning, Kyoto University Hospital, Japan
- 245 13
- $a An optimum data warehouse for epidemiological analysis using the National database of health insurance claims of Japan / $c Tomohide Iwao, Genta Kato, Shigeru Ohtsuru, Eiji Kondoh, Takeo Nakayama, Tomohiro Kuroda
- 504 __
- $a References
- 650 _7
- $a lidé $7 D006801 $2 czmesh
- 650 17
- $a analýza dat $7 D000078332 $2 czmesh
- 650 17
- $a epidemiologické studie $7 D016021 $2 czmesh
- 650 _7
- $a všeobecné zdravotní pojištění $7 D019472 $2 czmesh
- 650 _7
- $a poskytování zdravotní péče $7 D003695 $2 czmesh
- 650 _7
- $a databáze jako téma $7 D019992 $2 czmesh
- 650 _7
- $a big data $7 D000077558 $2 czmesh
- 651 _7
- $a Japonsko $7 D007564 $2 czmesh
- 700 1_
- $a Kato, Genta $u Solutions Center for Health Insurance Claims, Kyoto University Hospital, Japan
- 700 1_
- $a Ohtsuru, Shigeru $u Department of Primary Care and Emergency Medicine, Kyoto University Hospital, Japan
- 700 1_
- $a Kondoh, Eiji $u Department of Gynecology and Obstetrics, Kyoto University Hospital, Japan
- 700 1_
- $a Nakayama, Takeo $u Department of Health Informatics, Graduate School of Medicine and Public Health, Kyoto University, Japan
- 700 1_
- $a Kuroda, Tomohiro $u Division of Medical Information Technology and Administration Planning, Kyoto University Hospital, Japan
- 773 0_
- $t European journal for biomedical informatics $x 1801-5603 $g Vol. 15, No. 3 (2019), pp. 31-42 $w MED00173462
- 856 41
- $u http://www.ejbi.org/ $y journal homepage - full text freely accessible
- 910 __
- $a ABA008 $b online $y p $z 0
- 990 __
- $a 20200623133828 $b ABA008
- 991 __
- $a 20220830134608 $b ABA008
- 999 __
- $a ok $b bmc $g 1537545 $s 1099536
- BAS __
- $a 3 $a 4
- BMC __
- $a 2019 $b 15 $c 3 $d 31-42 $i 1801-5603 $m European Journal for Biomedical Informatics $n Eur. J. Biomed. Inform. (Praha) $x MED00173462
- LZP __
- $c NLK183 $d 20220830 $a NLK 2020-20/dk