The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study

G. Mårtensson, D. Ferreira, T. Granberg, L. Cavallin, K. Oppedal, A. Padovani, I. Rektorova, L. Bonanni, M. Pardini, MG. Kramberger, JP. Taylor, J. Hort, J. Snædal, J. Kulisevsky, F. Blanc, A. Antonini, P. Mecocci, B. Vellas, M. Tsolaki, I....

Medical image analysis. 2020 ; 66 (-) : 101714. [pub] 20200501

Language English Country Netherlands

Document type Journal Article, Research Support, N.I.H., Extramural, Research Support, Non-U.S. Gov't

Grant support
U01 AG024904 NIA NIH HHS - United States
W81XWH-12-2-0012 Department of Defense - International

Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as a clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical datasets (collected with different scanners, protocols and disease populations) and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, which had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact that including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to datasets acquired with protocols similar to those of the training data, but substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, including data from a wider range of scanners and protocols improved performance in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data before they are certified for deployment.

1st Department of Neurology Medical Faculty St Anne's Hospital and CEITEC Masaryk University Brno Czech Republic

3rd Department of Neurology Memory and Dementia Unit Aristotle University of Thessaloniki Thessaloniki Greece

Centre for Age Related Medicine Stavanger University Hospital Stavanger Norway

Centro de Investigación en Red Enfermedades Neurodegenerativas Barcelona Spain

Day Hospital of Geriatrics Memory Resource and Research Centre of Strasbourg Department of Geriatrics Hôpitaux Universitaires de Strasbourg Strasbourg France

Department of Clinical Neuroscience Karolinska Institutet Stockholm Sweden

Department of Electrical Engineering and Computer Science University of Stavanger Stavanger Norway

Department of Neuroimaging Centre for Neuroimaging Sciences Institute of Psychiatry Psychology and Neuroscience King's College London London UK

Department of Neurology University Medical Centre Ljubljana Medical faculty University of Ljubljana Slovenia

Department of Neuroscience Imaging and Clinical Sciences and CESI University G d'Annunzio of Chieti Pescara Chieti Italy

Department of Neuroscience University of Genoa and Neurology Clinics Polyclinic San Martino Hospital Genoa Italy

Department of Neuroscience University of Padua Padua and Fondazione Ospedale San Camillo Venezia Venice Italy

Department of Psychiatry Warneford Hospital University of Oxford Oxford UK

Department of Radiology Karolinska University Hospital Stockholm Sweden

Division of Clinical Geriatrics Department of Neurobiology Care Sciences and Society Karolinska Institutet Stockholm Sweden

Institut d'Investigacions Biomédiques Sant Pau Barcelona Spain

Institute of Clinical Medicine Neurology University of Eastern Finland Finland

Institute of Gerontology and Geriatrics University of Perugia Perugia Italy

Institute of Neuroscience Newcastle University Newcastle upon Tyne UK

Institute of Psychiatry Psychology and Neuroscience King's College London London UK

Landspitali University Hospital Reykjavik Iceland

Medical University of Lodz Lodz Poland

Memory Clinic Department of Neurology Charles University 2nd Faculty of Medicine and Motol University Hospital Prague Czech Republic

Movement Disorders Unit Neurology Department Sant Pau Hospital Barcelona Spain

Neurocenter Neurology Kuopio University Hospital Kuopio Finland

Neurology Unit Department of Clinical and Experimental Sciences University of Brescia Brescia Italy

NIHR Biomedical Research Centre for Mental Health London UK

NIHR Biomedical Research Unit for Dementia London UK

Stavanger Medical Imaging Laboratory Department of Radiology Stavanger University Hospital Stavanger Norway

UMR INSERM 1027 gerontopole CHU University of Toulouse France

Universitat Autónoma de Barcelona Barcelona Spain

University of Strasbourg and French National Centre for Scientific Research ICONE Strasbourg France

000      
00000naa a2200000 a 4500
001      
bmc21019756
003      
CZ-PrNML
005      
20210830101344.0
007      
ta
008      
210728s2020 ne f 000 0|eng||
009      
AR
024    7_
$a 10.1016/j.media.2020.101714 $2 doi
035    __
$a (PubMed)33007638
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a ne
100    1_
$a Mårtensson, Gustav $u Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Stockholm, Sweden. Electronic address: gustav.martensson@ki.se
245    14
$a The reliability of a deep learning model in clinical out-of-distribution MRI data: A multicohort study / $c G. Mårtensson, D. Ferreira, T. Granberg, L. Cavallin, K. Oppedal, A. Padovani, I. Rektorova, L. Bonanni, M. Pardini, MG. Kramberger, JP. Taylor, J. Hort, J. Snædal, J. Kulisevsky, F. Blanc, A. Antonini, P. Mecocci, B. Vellas, M. Tsolaki, I. Kłoszewska, H. Soininen, S. Lovestone, A. Simmons, D. Aarsland, E. Westman
520    9_
$a Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as a clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical datasets (collected with different scanners, protocols and disease populations) and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, which had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact that including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to datasets acquired with protocols similar to those of the training data, but substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, including data from a wider range of scanners and protocols improved performance in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data before they are certified for deployment.
650    _2
$a brain $x diagnostic imaging $7 D001921
650    12
$a deep learning $7 D000077321
650    _2
$a humans $7 D006801
650    _2
$a magnetic resonance imaging $7 D008279
650    _2
$a neural networks, computer $7 D016571
650    _2
$a reproducibility of results $7 D015203
655    _2
$a Journal Article $7 D016428
655    _2
$a Research Support, N.I.H., Extramural $7 D052061
655    _2
$a Research Support, Non-U.S. Gov't $7 D013485
700    1_
$a Ferreira, Daniel $u Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Stockholm, Sweden
700    1_
$a Granberg, Tobias $u Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Department of Radiology, Karolinska University Hospital, Stockholm, Sweden
700    1_
$a Cavallin, Lena $u Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Department of Radiology, Karolinska University Hospital, Stockholm, Sweden
700    1_
$a Oppedal, Ketil $u Centre for Age-Related Medicine, Stavanger University Hospital, Stavanger, Norway; Stavanger Medical Imaging Laboratory (SMIL), Department of Radiology, Stavanger University Hospital, Stavanger, Norway; Department of Electrical Engineering and Computer Science, University of Stavanger, Stavanger, Norway
700    1_
$a Padovani, Alessandro $u Neurology Unit, Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
700    1_
$a Rektorova, Irena $u 1st Department of Neurology, Medical Faculty, St. Anne's Hospital and CEITEC, Masaryk University, Brno, Czech Republic
700    1_
$a Bonanni, Laura $u Department of Neuroscience Imaging and Clinical Sciences and CESI, University G d'Annunzio of Chieti-Pescara, Chieti, Italy
700    1_
$a Pardini, Matteo $u Department of Neuroscience (DINOGMI), University of Genoa and Neurology Clinics, Polyclinic San Martino Hospital, Genoa, Italy
700    1_
$a Kramberger, Milica G $u Department of Neurology, University Medical Centre Ljubljana, Medical faculty, University of Ljubljana, Slovenia
700    1_
$a Taylor, John-Paul $u Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
700    1_
$a Hort, Jakub $u Memory Clinic, Department of Neurology, Charles University, 2nd Faculty of Medicine and Motol University Hospital, Prague, Czech Republic
700    1_
$a Snædal, Jón $u Landspitali University Hospital, Reykjavik, Iceland
700    1_
$a Kulisevsky, Jaime $u Movement Disorders Unit, Neurology Department, Sant Pau Hospital, Barcelona, Spain; Institut d'Investigacions Biomédiques Sant Pau (IIB-Sant Pau), Barcelona, Spain; Centro de Investigación en Red-Enfermedades Neurodegenerativas (CIBERNED), Barcelona, Spain; Universitat Autónoma de Barcelona (U.A.B.), Barcelona, Spain
700    1_
$a Blanc, Frederic $u Day Hospital of Geriatrics, Memory Resource and Research Centre (CM2R) of Strasbourg, Department of Geriatrics, Hôpitaux Universitaires de Strasbourg, Strasbourg, France; University of Strasbourg and French National Centre for Scientific Research (CNRS), ICube Laboratory and Fédération de Médecine Translationnelle de Strasbourg (FMTS), Team Imagerie Multimodale Intégrative en Santé (IMIS)/ICONE, Strasbourg, France
700    1_
$a Antonini, Angelo $u Department of Neuroscience, University of Padua, Padua & Fondazione Ospedale San Camillo, Venezia, Venice, Italy
700    1_
$a Mecocci, Patrizia $u Institute of Gerontology and Geriatrics, University of Perugia, Perugia, Italy
700    1_
$a Vellas, Bruno $u UMR INSERM 1027, gerontopole, CHU, University of Toulouse, France
700    1_
$a Tsolaki, Magda $u 3rd Department of Neurology, Memory and Dementia Unit, Aristotle University of Thessaloniki, Thessaloniki, Greece
700    1_
$a Kłoszewska, Iwona $u Medical University of Lodz, Lodz, Poland
700    1_
$a Soininen, Hilkka $u Institute of Clinical Medicine, Neurology, University of Eastern Finland, Finland; Neurocenter, Neurology, Kuopio University Hospital, Kuopio, Finland
700    1_
$a Lovestone, Simon $u Department of Psychiatry, Warneford Hospital, University of Oxford, Oxford, UK
700    1_
$a Simmons, Andrew $u NIHR Biomedical Research Centre for Mental Health, London, UK; NIHR Biomedical Research Unit for Dementia, London, UK; Department of Neuroimaging, Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
700    1_
$a Aarsland, Dag $u Centre for Age-Related Medicine, Stavanger University Hospital, Stavanger, Norway; Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
700    1_
$a Westman, Eric $u Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Karolinska Institutet, Stockholm, Sweden; Department of Neuroimaging, Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
773    0_
$w MED00007107 $t Medical image analysis $x 1361-8423 $g Vol. 66, No. - (2020), p. 101714
856    41
$u https://pubmed.ncbi.nlm.nih.gov/33007638 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20210728 $b ABA008
991    __
$a 20210830101344 $b ABA008
999    __
$a ok $b bmc $g 1690546 $s 1140202
BAS    __
$a 3
BAS    __
$a PreBMC
BMC    __
$a 2020 $b 66 $c - $d 101714 $e 20200501 $i 1361-8423 $m Medical image analysis $n Med Image Anal $x MED00007107
GRA    __
$a U01 AG024904 $p NIA NIH HHS $2 United States
GRA    __
$a W81XWH-12-2-0012 $p Department of Defense $2 International
LZP    __
$a Pubmed-20210728
