International multicenter validation of AI-driven ultrasound detection of ovarian cancer

2025 Jan; 31(1): 189-196. [epub] 2025-01-02

Language: English. Country: United States. Medium: print-electronic

Document type: journal article, multicenter study, validation study

Persistent link: https://www.medvik.cz/link/pmid39747679

Grant support
231143 Radiumhemmets Forskningsfonder (Cancer Research Foundations of Radiumhemmet)
211657 Pi 01 H Cancerfonden (Swedish Cancer Society)
2020-01702 Vetenskapsrådet (Swedish Research Council)

Links

PubMed 39747679
PubMed Central PMC11750711
DOI 10.1038/s41591-024-03329-4
PII: 10.1038/s41591-024-03329-4
Knihovny.cz E-resources

Ovarian lesions are common and often incidentally detected. A critical shortage of expert ultrasound examiners has raised concerns of unnecessary interventions and delayed cancer diagnoses. Deep learning has shown promising results in the detection of ovarian cancer in ultrasound images; however, external validation is lacking. In this international multicenter retrospective study, we developed and validated transformer-based neural network models using a comprehensive dataset of 17,119 ultrasound images from 3,652 patients across 20 centers in eight countries. Using a leave-one-center-out cross-validation scheme, for each center in turn, we trained a model using data from the remaining centers. The models demonstrated robust performance across centers, ultrasound systems, histological diagnoses and patient age groups, significantly outperforming both expert and non-expert examiners on all evaluated metrics, namely F1 score, sensitivity, specificity, accuracy, Cohen's kappa, Matthews correlation coefficient, diagnostic odds ratio and Youden's J statistic. Furthermore, in a retrospective triage simulation, artificial intelligence (AI)-driven diagnostic support reduced referrals to experts by 63% while significantly surpassing the diagnostic performance of the current practice. These results show that transformer-based models exhibit strong generalization and above human expert-level diagnostic accuracy, with the potential to alleviate the shortage of expert ultrasound examiners and improve patient outcomes.
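The leave-one-center-out scheme and the listed metrics can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `leave_one_center_out` and `binary_metrics`, the center labels and the confusion-matrix counts are hypothetical, and the formulas are the standard textbook definitions for a binary benign/malignant classification.

```python
import math
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_center_out(
    centers: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_center, train_indices, test_indices), one fold per center.

    `centers[i]` is the (hypothetical) center label of sample i. Each fold
    trains on every other center and tests on the held-out one.
    """
    for held_out in sorted(set(centers)):
        train = [i for i, c in enumerate(centers) if c != held_out]
        test = [i for i, c in enumerate(centers) if c == held_out]
        yield held_out, train, test


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute the eight metrics named in the abstract from a 2x2 table.

    Assumes both classes are present and both labels are predicted at least
    once, so that no denominator below is zero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Diagnostic odds ratio; infinite when there are no errors of one type.
    dor = (tp * tn) / (fp * fn) if fp * fn > 0 else float("inf")
    youden_j = sensitivity + specificity - 1
    return {
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "cohen_kappa": kappa,
        "mcc": mcc,
        "diagnostic_odds_ratio": dor,
        "youden_j": youden_j,
    }


if __name__ == "__main__":
    # Toy data only: five samples from three made-up centers.
    for center, train_idx, test_idx in leave_one_center_out(
        ["A", "A", "B", "C", "B"]
    ):
        print(center, train_idx, test_idx)
    # Toy confusion matrix, not results from the study.
    print(binary_metrics(tp=40, fp=10, fn=10, tn=40))
```

In a setup like this, a separate model would be fitted on each fold's training indices and scored on the held-out center, so every reported figure comes from data the model never saw during training.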

1st Department of Obstetrics and Gynecology Alexandra Hospital Medical School National and Kapodistrian University of Athens Athens Greece

3rd Faculty of Medicine Charles University Prague Czech Republic

Centro Integrato di Procreazione Medicalmente Assistita e Diagnostica Ostetrico Ginecologica Azienda Ospedaliero Universitaria Policlinico Duilio Casula Monserrato University of Cagliari Cagliari Italy

Department of Clinical Science and Education Södersjukhuset Karolinska Institutet Stockholm Sweden

Department of Gynecological Oncology and Gynecology Medical University of Lublin Lublin Poland

Department of Medicine and Surgery University of Milan Bicocca Milan Italy

Department of Obstetrics and Gynaecology Lithuanian University of Health Sciences Kaunas Lithuania

Department of Obstetrics and Gynecology Clínica Universidad de Navarra Pamplona Spain

Department of Obstetrics and Gynecology Rizal Medical Center Manila Philippines

Department of Obstetrics and Gynecology Skåne University Hospital Lund Sweden

Department of Obstetrics and Gynecology Södersjukhuset Stockholm Sweden

Department of Obstetrics Gynecology and Reproduction Dexeus University Hospital Barcelona Spain

Department of Perinatology and Oncological Gynecology Faculty of Medical Sciences Medical University of Silesia Katowice Poland

Digital Futures KTH Royal Institute of Technology Stockholm Sweden

Fondazione Poliambulanza Istituto Ospedaliero Brescia Italy

Gynecologic and Obstetric Unit Women's and Children's Department Forlì Hospital Forlì Italy

Gynecologic Oncology Centre Department of Gynecology Obstetrics and Neonatology 1st Faculty of Medicine Charles University and General University Hospital Prague Prague Czech Republic

Gynecology and Breast Care Center Mater Olbia Hospital Olbia Italy

Institute for Maternal and Child Health IRCCS 'Burlo Garofolo' Trieste Italy

Institute for the Care of Mother and Child Prague Czech Republic

Obstetrics and Gynecology Unit Forlì and Faenza Hospitals AUSL Romagna Forlì Italy

School of Electrical Engineering and Computer Science KTH Royal Institute of Technology Stockholm Sweden

Science for Life Laboratory Stockholm Sweden

Section of Obstetrics and Gynecology Department of Clinical Sciences Università Politecnica delle Marche Azienda Ospedaliero Universitaria delle Marche Ancona Italy

Unit of Obstetrics and Gynecology Department of Biomedical and Clinical Sciences Luigi Sacco University Hospital University of Milan Milan Italy

Unit of Preventive Gynecology European Institute of Oncology IRCCS Milan Italy

UO Gynecology Fondazione IRCCS San Gerardo dei Tintori Monza Italy

Latest 20 citations...

ADNEX-AI: automated extraction of ultrasound predictors for interpretable ovarian cancer risk stratification

2025 Dec 11; 10(1): 18. [epub] 2025-12-11
