Czech medical coding assistant based on transformer networks
L. Lenc, J. Martínek, J. Baloun, P. Přibáň, M. Prantl, SE. Taylor, P. Král, J. Kyliš
Language: English; Country: United States
Document type: Journal Article
- MeSH
- Electronic Health Records MeSH
- Clinical Coding * MeSH
- Humans MeSH
- International Classification of Diseases * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Geographicals
- Czech Republic MeSH
The International Classification of Diseases (ICD) hierarchical taxonomy is used for clinical coding of medical reports, which are typically written as unstructured text. In the Czech Republic, this coding is currently carried out manually by a clinical coder. Because it depends on human effort, the process is error-prone and expensive: the coder must be properly trained and spends significant effort on each report, and mistakes still occur occasionally. The main goal of this paper is to propose and implement a system that serves as an assistant to the coder and automatically predicts diagnosis codes. These predictions are then presented to the coder for approval or correction, with the aim of improving efficiency and accuracy. We consider two classification tasks: the main (principal) diagnosis, and all diagnoses. Crucial requirements for the implementation include minimal memory consumption, generality, ease of portability, and sustainability. The main contribution lies in the proposal and evaluation of ICD classification models for the Czech language with relatively few trainable parameters, allowing swift use on the computer systems prevalent in Czech hospitals and enabling easy retraining or fine-tuning with newly available data. First, we introduce a small transformer-based model for each task, followed by the design of a transformer-based "Four-headed" model incorporating four distinct classification heads. This model achieves results comparable to, and sometimes better than, those of four individual models. Moreover, this novel model significantly reduces memory usage and training time. We also show that our models achieve results comparable to state-of-the-art English models on the MIMIC-IV dataset, even though our models are significantly smaller.
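The memory argument for a single model with several classification heads can be illustrated with a rough parameter count. The sketch below is not taken from the paper: the encoder size, hidden width, and label counts are hypothetical placeholders, and each head is assumed to be a single dense layer on top of a shared encoder.

```python
# Rough parameter-count comparison: four separate models versus one
# shared encoder with four classification heads.
# All numbers below are hypothetical placeholders, not values from the paper.


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one dense classification layer."""
    return n_in * n_out + n_out


def separate_models(encoder_params: int, hidden: int, label_counts: list[int]) -> int:
    """Each task gets its own encoder and its own head."""
    return sum(encoder_params + linear_params(hidden, n) for n in label_counts)


def shared_model(encoder_params: int, hidden: int, label_counts: list[int]) -> int:
    """One encoder is shared; only the heads are duplicated per task."""
    return encoder_params + sum(linear_params(hidden, n) for n in label_counts)


ENCODER_PARAMS = 10_000_000                 # assumed size of a small transformer encoder
HIDDEN = 256                                # assumed width of the encoder output
LABEL_COUNTS = [1_500, 1_500, 6_000, 6_000]  # assumed label-set sizes of the four heads

separate = separate_models(ENCODER_PARAMS, HIDDEN, LABEL_COUNTS)
shared = shared_model(ENCODER_PARAMS, HIDDEN, LABEL_COUNTS)

print(f"four separate models: {separate:,} parameters")
print(f"one four-headed model: {shared:,} parameters")
print(f"relative saving: {1 - shared / separate:.1%}")
```

Under these assumptions the saving is exactly three encoders' worth of parameters, because the heads are needed in both designs. How large that is in relative terms depends on how the encoder size compares with the head sizes.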
- 000
- 00000naa a2200000 a 4500
- 001
- bmc24019534
- 003
- CZ-PrNML
- 005
- 20241024110755.0
- 007
- ta
- 008
- 241015e20240527xxu f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1016/j.compbiomed.2024.108672 $2 doi
- 035 __
- $a (PubMed)38875906
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a xxu
- 100 1_
- $a Lenc, Ladislav $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic. Electronic address: llenc@kiv.zcu.cz
- 245 10
- $a Czech medical coding assistant based on transformer networks / $c L. Lenc, J. Martínek, J. Baloun, P. Přibáň, M. Prantl, SE. Taylor, P. Král, J. Kyliš
- 520 9_
$a The International Classification of Diseases (ICD) hierarchical taxonomy is used for clinical coding of medical reports, which are typically written as unstructured text. In the Czech Republic, this coding is currently carried out manually by a clinical coder. Because it depends on human effort, the process is error-prone and expensive: the coder must be properly trained and spends significant effort on each report, and mistakes still occur occasionally. The main goal of this paper is to propose and implement a system that serves as an assistant to the coder and automatically predicts diagnosis codes. These predictions are then presented to the coder for approval or correction, with the aim of improving efficiency and accuracy. We consider two classification tasks: the main (principal) diagnosis, and all diagnoses. Crucial requirements for the implementation include minimal memory consumption, generality, ease of portability, and sustainability. The main contribution lies in the proposal and evaluation of ICD classification models for the Czech language with relatively few trainable parameters, allowing swift use on the computer systems prevalent in Czech hospitals and enabling easy retraining or fine-tuning with newly available data. First, we introduce a small transformer-based model for each task, followed by the design of a transformer-based "Four-headed" model incorporating four distinct classification heads. This model achieves results comparable to, and sometimes better than, those of four individual models. Moreover, this novel model significantly reduces memory usage and training time. We also show that our models achieve results comparable to state-of-the-art English models on the MIMIC-IV dataset, even though our models are significantly smaller.
- 650 _2
- $a lidé $7 D006801
- 650 12
- $a klinické kódování $7 D059019
- 650 12
- $a mezinárodní klasifikace nemocí $7 D038801
- 650 _2
- $a elektronické zdravotní záznamy $7 D057286
- 651 _2
- $a Česká republika $7 D018153
- 655 _2
- $a časopisecké články $7 D016428
- 700 1_
- $a Martínek, Jiří $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Baloun, Josef $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Přibáň, Pavel $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Prantl, Martin $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Taylor, Stephen Eugene $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Král, Pavel $u Dept. of Computer Science & Engineering, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic; NTIS - New Technologies for the Information Society, Faculty of Applied Sciences, University of West Bohemia, Univerzitni 8, Plzeň, 30100, Czech Republic
- 700 1_
- $a Kyliš, Jiří $u ICZ Group, Na Hřebenech II 1718/10, Praha, 14000, Czech Republic
- 773 0_
- $w MED00001218 $t Computers in biology and medicine $x 1879-0534 $g Roč. 178 (20240527), s. 108672
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/38875906 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y - $z 0
- 990 __
- $a 20241015 $b ABA008
- 991 __
- $a 20241024110749 $b ABA008
- 999 __
- $a ok $b bmc $g 2202017 $s 1231507
- BAS __
- $a 3
- BAS __
- $a PreBMC-MEDLINE
- BMC __
- $a 2024 $b 178 $c - $d 108672 $e 20240527 $i 1879-0534 $m Computers in biology and medicine $n Comput Biol Med $x MED00001218
- LZP __
- $a Pubmed-20241015