Inconsistency between Human Observation and Deep Learning Models: Assessing Validity of Postmortem Computed Tomography Diagnosis of Drowning

Y. Zeng, X. Zhang, J. Wang, A. Usui, K. Ichiji, I. Bukovsky, S. Chou, M. Funayama, N. Homma

Zeng, Yuwen (ORCID): Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan. yuwen@tohoku.ac.jp
Zhang, Xiaoyong: National Institute of Technology, Sendai College, Sendai, Japan
Wang, Jiaoyang: Department of Intelligent Biomedical System Engineering, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan
Usui, Akihito: Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
Ichiji, Kei: Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
Bukovsky, Ivo: Faculty of Science, University of South Bohemia in Ceske Budejovice, Ceske Budejovice, Czech Republic; Mechanical Engineering, Czech Technical University in Prague, Prague, Czech Republic
Chou, Shuoyan: Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan
Funayama, Masato: Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan
Homma, Noriyasu: Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

Journal of imaging informatics in medicine. 2024 ; 37 (3) : 1-10. [pub] 20240209

J Imaging Inform Med
ISSN 2948-2933

Language: English. Country: Switzerland

Document type: journal article

Persistent link: https://www.medvik.cz/link/bmc24013785

Grant support
JP18K19892 Japan Society for the Promotion of Science
JP19H04479 Japan Society for the Promotion of Science
JP20K08012 Japan Society for the Promotion of Science

Online full text

NLK PubMed Central from 2024

PubMed 38336949
DOI 10.1007/s10278-024-00974-6

MeSH
Deep Learning *
Child
Adult
Middle Aged
Humans
Adolescent
Young Adult
Autopsy * methods
Tomography, X-Ray Computed * methods
Postmortem Imaging
Reproducibility of Results
Retrospective Studies
ROC Curve
Aged, 80 and over
Aged
Drowning * diagnosis
Check Tag
Child
Adult
Middle Aged
Humans
Adolescent
Young Adult
Male
Aged, 80 and over
Aged
Female
Publication type
Journal Article

Diagnosing drowning at autopsy is a complicated process, even with the assistance of autopsy imaging and on-site information about where the body was found. Previous studies have developed high-performing deep learning (DL) models for drowning diagnosis. However, the validity of these models was not assessed, raising doubts about whether the learned features accurately represent the medical findings observed by human experts. In this paper, we assessed the medical validity of DL models that had achieved high classification performance for drowning diagnosis. This retrospective study included autopsy cases aged 8-91 years who underwent postmortem computed tomography between 2012 and 2021 (153 drowning and 160 non-drowning cases). We first trained three DL models from a previous work and generated saliency maps that highlight important features in the input. To assess the validity of the models, pixel-level annotations were created by four radiological technologists and quantitatively compared with the saliency maps. All three models demonstrated high classification performance, with areas under the receiver operating characteristic curve of 0.94, 0.97, and 0.98, respectively. However, the assessment revealed unexpected inconsistency between the annotations and the models' saliency maps: around 30%, 40%, and 80% of the areas in the respective models' saliency maps were irrelevant, suggesting that the predictions of the DL models might be unreliable. This result highlights the need for careful assessment of DL tools, even those with high classification performance.
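
The abstract does not state how the saliency maps were compared with the annotations, so the following is only a minimal sketch of one plausible measure: the fraction of salient pixels that fall outside the expert-annotated region. The function name `irrelevant_fraction`, the threshold of 0.5, and the toy 3x3 data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def irrelevant_fraction(
    saliency: Sequence[Sequence[float]],
    annotation: Sequence[Sequence[bool]],
    threshold: float = 0.5,  # assumed cut-off for calling a pixel "salient"
) -> float:
    """Return the share of salient pixels lying outside the annotated region.

    A pixel counts as salient when its saliency value is >= threshold.
    Returns 0.0 when no pixel is salient.
    """
    salient = 0
    outside = 0
    for sal_row, ann_row in zip(saliency, annotation):
        for value, annotated in zip(sal_row, ann_row):
            if value >= threshold:
                salient += 1
                if not annotated:
                    outside += 1
    return outside / salient if salient else 0.0


# Hypothetical toy data: a 3x3 saliency map and a matching annotation mask.
saliency = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.6, 0.1, 0.0],
]
annotation = [
    [True, True, False],
    [False, False, False],
    [False, False, False],
]

# Four pixels are salient; two of them are outside the annotation.
print(irrelevant_fraction(saliency, annotation))  # 0.5
```

Under this definition, a value near 0 means the model attends mostly to regions the annotators marked, while a value near 1 means most of the highlighted area lies elsewhere.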

Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan

Department of Intelligent Biomedical System Engineering, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan

Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

Faculty of Science, University of South Bohemia in Ceske Budejovice, Ceske Budejovice, Czech Republic

Mechanical Engineering, Czech Technical University in Prague, Prague, Czech Republic

National Institute of Technology, Sendai College, Sendai, Japan


000: 00000naa a2200000 a 4500

001: bmc24013785

003: CZ-PrNML

005: 20240905134421.0

007: ta

008: 240725s2024 sz f 000 0|eng||

009: AR

024 7_: $a 10.1007/s10278-024-00974-6 $2 doi

035 __: $a (PubMed)38336949

040 __: $a ABA008 $b cze $d ABA008 $e AACR2

041 0_: $a eng

044 __: $a sz

100 1_: $a Zeng, Yuwen $u Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan. yuwen@tohoku.ac.jp $1 https://orcid.org/0000000336157766

245 10: $a Inconsistency between Human Observation and Deep Learning Models: Assessing Validity of Postmortem Computed Tomography Diagnosis of Drowning / $c Y. Zeng, X. Zhang, J. Wang, A. Usui, K. Ichiji, I. Bukovsky, S. Chou, M. Funayama, N. Homma

520 9_: $a Drowning diagnosis is a complicated process in the autopsy, even with the assistance of autopsy imaging and the on-site information from where the body was found. Previous studies have developed well-performed deep learning (DL) models for drowning diagnosis. However, the validity of the DL models was not assessed, raising doubts about whether the learned features accurately represented the medical findings observed by human experts. In this paper, we assessed the medical validity of DL models that had achieved high classification performance for drowning diagnosis. This retrospective study included autopsy cases aged 8-91 years who underwent postmortem computed tomography between 2012 and 2021 (153 drowning and 160 non-drowning cases). We first trained three deep learning models from a previous work and generated saliency maps that highlight important features in the input. To assess the validity of models, pixel-level annotations were created by four radiological technologists and further quantitatively compared with the saliency maps. All the three models demonstrated high classification performance with areas under the receiver operating characteristic curves of 0.94, 0.97, and 0.98, respectively. On the other hand, the assessment results revealed unexpected inconsistency between annotations and models' saliency maps. In fact, each model had, respectively, around 30%, 40%, and 80% of irrelevant areas in the saliency maps, suggesting the predictions of the DL models might be unreliable. The result alerts us in the careful assessment of DL tools, even those with high classification performance.

650 _2: $a lidé $7 D006801

650 12: $a deep learning $7 D000077321

650 12: $a utonutí $x diagnóza $7 D004332

650 _2: $a senioři $7 D000368

650 _2: $a dítě $7 D002648

650 _2: $a senioři nad 80 let $7 D000369

650 12: $a počítačová rentgenová tomografie $x metody $7 D014057

650 _2: $a mladiství $7 D000293

650 _2: $a dospělí $7 D000328

650 12: $a pitva $x metody $7 D001344

650 _2: $a lidé středního věku $7 D008875

650 _2: $a ženské pohlaví $7 D005260

650 _2: $a retrospektivní studie $7 D012189

650 _2: $a mužské pohlaví $7 D008297

650 _2: $a mladý dospělý $7 D055815

650 _2: $a ROC křivka $7 D012372

650 _2: $a reprodukovatelnost výsledků $7 D015203

650 _2: $a posmrtné zobrazování $7 D000097873

655 _2: $a časopisecké články $7 D016428

700 1_: $a Zhang, Xiaoyong $u National Institute of Technology, Sendai College, Sendai, Japan

700 1_: $a Wang, Jiaoyang $u Department of Intelligent Biomedical System Engineering, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan

700 1_: $a Usui, Akihito $u Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

700 1_: $a Ichiji, Kei $u Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

700 1_: $a Bukovsky, Ivo $u Faculty of Science, University of South Bohemia in Ceske Budejovice, Ceske Budejovice, Czech Republic $u Mechanical Engineering, Czech Technical University in Prague, Prague, Czech Republic

700 1_: $a Chou, Shuoyan $u Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan

700 1_: $a Funayama, Masato $u Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

700 1_: $a Homma, Noriyasu $u Department of Radiological Imaging and Informatics, Tohoku University Graduate School of Medicine, Sendai, Japan

773 0_: $w MED00215148 $t Journal of imaging informatics in medicine $x 2948-2933 $g Roč. 37, č. 3 (2024), s. 1-10

856 41: $u https://pubmed.ncbi.nlm.nih.gov/38336949 $y Pubmed

910 __: $a ABA008 $b sig $c sign $y - $z 0

990 __: $a 20240725 $b ABA008

991 __: $a 20240905134415 $b ABA008

999 __: $a ok $b bmc $g 2143537 $s 1225651

BAS __: $a 3

BAS __: $a PreBMC-MEDLINE

BMC __: $a 2024 $b 37 $c 3 $d 1-10 $e 20240209 $i 2948-2933 $m Journal of imaging informatics in medicine $n J Imaging Inform Med $x MED00215148

GRA __: $a JP18K19892 $p Japan Society for the Promotion of Science

GRA __: $a JP19H04479 $p Japan Society for the Promotion of Science

GRA __: $a JP20K08012 $p Japan Society for the Promotion of Science

LZP __: $a Pubmed-20240725
