Tool-supported Interactive Correction and Semantic Annotation of Narrative Clinical Reports

K. Zvára, M. Tomečková, J. Peleška, V. Svátek, J. Zvárová

Methods of Information in Medicine. 2017 ; 56 (3) : 217-229. [pub] 20170428

Language English Country Germany

Document type Journal Article

OBJECTIVES: Our main objective is to design a method of, and supporting software for, interactive correction and semantic annotation of narrative clinical reports, which would make them easier and less error-prone to process outside their original context: first, by physicians unfamiliar with the original language (and possibly also the source specialty), and second, by tools requiring structured information, such as decision-support systems. Our additional goal is to gain insights into the process of narrative report creation, including the errors and ambiguities arising therein, and also into the process of annotating reports with clinical terms. Finally, we also aim to provide a dataset of ground-truth transformations (specific to Czech as the source language), set up by expert physicians, which can be reused in the future for subsequent analytical studies and for training automated transformation procedures.
METHODS: A three-phase preprocessing method has been developed to support secondary use of narrative clinical reports in electronic health records. Narrative clinical reports are narrative texts from healthcare documentation, often stored in electronic health records. In the first phase, a narrative clinical report is tokenized. In the second phase, the tokenized clinical report is normalized. The normalized clinical report is easily readable for health professionals who know the language used in the narrative clinical report. In the third phase, the normalized clinical report is enriched with extracted structured information. The final result of the third phase is a semi-structured normalized clinical report in which the extracted clinical terms are matched to codebook terms. Software tools for interactive correction, expansion and semantic annotation of narrative clinical reports have been developed, and the three-phase preprocessing method has been validated in the cardiology area.
RESULTS: The three-phase preprocessing method was validated on 49 anonymous Czech narrative clinical reports in the field of cardiology. Descriptive statistics from the database of accomplished transformations have been calculated. Two cardiologists participated in the annotation phase. The first cardiologist annotated 1500 clinical terms found in the 49 narrative clinical reports with codebook terms using the classification systems ICD-10, SNOMED CT, LOINC and LEKY. The second cardiologist validated the annotations of the first cardiologist. The correct clinical terms and the codebook terms have been stored in a database.
CONCLUSIONS: We extracted structured information from Czech narrative clinical reports by the proposed three-phase preprocessing method and linked it to electronic health records. The software tool, although generic, is tailored for Czech as the specific language of the electronic health record pool under study. This will provide a potential reference standard for porting this approach to dozens of other less-spoken languages. Structured information can support medical decision making, quality assurance tasks and further medical research.
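
The three phases named in the abstract (tokenization, normalization, annotation against a codebook) can be illustrated with a minimal Python sketch. This is not the authors' tool, which is interactive and relies on corrections made by physicians. The function names, the abbreviation table and the single codebook entry are assumptions introduced here for illustration, and the substring matching in the third phase is a deliberately naive stand-in for real term extraction.

```python
import re
from dataclasses import dataclass


@dataclass
class Annotation:
    term: str    # normalized clinical term found in the report
    system: str  # classification system the code comes from
    code: str    # codebook identifier


# Toy lookup tables (assumptions for illustration only, not the paper's data).
ABBREVIATIONS = {"ap": "angina pectoris", "pt": "patient"}
CODEBOOK = {"angina pectoris": ("ICD-10", "I20")}


def tokenize(report: str) -> list[str]:
    """Phase 1: split the raw report into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", report)


def normalize(tokens: list[str]) -> list[str]:
    """Phase 2: lowercase tokens and expand known abbreviations."""
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]


def annotate(tokens: list[str]) -> list[Annotation]:
    """Phase 3: match codebook terms against the normalized text."""
    text = " ".join(tokens)
    return [
        Annotation(term, system, code)
        for term, (system, code) in CODEBOOK.items()
        if term in text  # naive substring match
    ]


def preprocess(report: str) -> tuple[str, list[Annotation]]:
    """Run all three phases and return the normalized text with annotations."""
    normalized = normalize(tokenize(report))
    return " ".join(normalized), annotate(normalized)
```

Calling `preprocess("Pt with AP.")` on this toy data expands both abbreviations and attaches one ICD-10 annotation. In the workflow described in the abstract, each phase would instead present its output to a physician for correction before the next phase runs.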

000      
00000naa a2200000 a 4500
001      
bmc18010623
003      
CZ-PrNML
005      
20180426155029.0
007      
ta
008      
180404s2017 gw f 000 0|eng||
009      
AR
024    7_
$a 10.3414/ME16-01-0083 $2 doi
035    __
$a (PubMed)28451691
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a gw
100    1_
$a Zvára, Karel
245    10
$a Tool-supported Interactive Correction and Semantic Annotation of Narrative Clinical Reports / $c K. Zvára, M. Tomečková, J. Peleška, V. Svátek, J. Zvárová,
650    _2
$a Data Accuracy $7 D000068598
650    _2
$a Electronic Health Records $x standards $7 D057286
650    _2
$a Guidelines as Topic $7 D017408
650    _2
$a International Classification of Diseases $7 D038801
650    12
$a Machine Learning $7 D000069550
650    _2
$a Meaningful Use $x standards $7 D062527
650    12
$a Natural Language Processing $7 D009323
650    12
$a Semantics $7 D012660
650    _2
$a Software $7 D012984
650    _2
$a User-Computer Interface $7 D014584
650    12
$a Vocabulary, Controlled $7 D018875
650    _2
$a Word Processing $x standards $7 D015443
650    _2
$a Writing $x standards $7 D014956
655    _2
$a Journal Article $7 D016428
700    1_
$a Tomečková, Marie
700    1_
$a Peleška, Jan
700    1_
$a Svátek, Vojtěch
700    1_
$a Zvárová, Jana $u Prof. Jana Zvárová, Ph.D., DSc., FEFMI, Institute of Hygiene and Epidemiology, 1st Faculty of Medicine, Charles University, Studnickova 7, 128 00 Prague 2, Czech Republic, E-mail: jana.zvarova@lf1.cuni.cz.
773    0_
$w MED00003333 $t Methods of information in medicine $x 2511-705X $g Vol. 56, No. 3 (2017), pp. 217-229
856    41
$u https://pubmed.ncbi.nlm.nih.gov/28451691 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y a $z 0
990    __
$a 20180404 $b ABA008
991    __
$a 20180426155140 $b ABA008
999    __
$a ok $b bmc $g 1288108 $s 1007435
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2017 $b 56 $c 3 $d 217-229 $e 20170428 $i 2511-705X $m Methods of information in medicine $n Methods Inf Med $x MED00003333
LZP    __
$a Pubmed-20180404
