Enhancing Accuracy in Breast Density Assessment Using Deep Learning: A Multicentric, Multi-Reader Study
Status PubMed-not-MEDLINE Language English Country Switzerland Media electronic
Document type Journal Article
Grant support
X001
Carebot, Ltd.
PubMed
38893643
PubMed Central
PMC11172127
DOI
10.3390/diagnostics14111117
PII: diagnostics14111117
- Keywords
- BI-RADS, breast density, computer-aided diagnosis, deep learning, full-field digital mammography, medical image processing
- Publication type
- Journal Article
Mammographic breast density, a critical indicator of breast cancer risk, is traditionally assessed by radiologists through visual inspection of mammography images using the Breast Imaging-Reporting and Data System (BI-RADS) breast density categories. This method is subject to substantial interobserver variability, which leads to inconsistent and potentially inaccurate density assessments and subsequent risk estimates. To address this, we present a deep learning-based automatic detection algorithm (DLAD) for the automated evaluation of breast density. Our multicentric, multi-reader study uses a diverse dataset of 122 full-field digital mammography studies (488 images in craniocaudal (CC) and mediolateral oblique (MLO) projections) sourced from three institutions. Two experienced radiologists conducted a retrospective analysis that established a ground truth for 72 mammography studies (BI-RADS class A: 18, class B: 43, class C: 7, class D: 4). The DLAD was then compared with five independent radiologists with varying levels of experience. The DLAD achieved an accuracy of 0.819 (95% CI: 0.736-0.903), an F1 score of 0.798 (0.594-0.905), precision of 0.806 (0.596-0.896), recall of 0.830 (0.650-0.946), and a Cohen's kappa (κ) of 0.708 (0.562-0.841). This performance matches, and in four cases exceeds, that of the individual radiologists. The statistical analysis did not reveal a significant difference in accuracy between the DLAD and the radiologists, indicating that the model's assessments align with those of professional radiologists. These results indicate that the DLAD can enhance the accuracy and consistency of breast density assessment, offering a reliable tool for improving breast cancer screening outcomes.
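The agreement metrics reported in the abstract (accuracy, precision, recall, F1 score, and Cohen's kappa) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `density_metrics`, the macro averaging over the four BI-RADS classes, and the eight toy labels are assumptions made here for illustration only, since the abstract does not state the averaging scheme and the study data are not reproduced in this record.

```python
from collections import Counter

CLASSES = ["A", "B", "C", "D"]  # BI-RADS breast density categories


def density_metrics(truth, pred, classes=CLASSES):
    """Return accuracy, macro precision/recall/F1 and Cohen's kappa.

    Hypothetical helper for illustration; macro averaging is an assumption.
    """
    if len(truth) != len(pred) or not truth:
        raise ValueError("truth and pred must be non-empty and equally long")
    n = len(truth)
    pairs = Counter(zip(truth, pred))  # confusion-matrix cells
    truth_n = Counter(truth)           # per-class ground-truth counts
    pred_n = Counter(pred)             # per-class predicted counts

    # Observed agreement: fraction of studies on the matrix diagonal.
    accuracy = sum(pairs[(c, c)] for c in classes) / n

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = pairs[(c, c)]
        p = tp / pred_n[c] if pred_n[c] else 0.0
        r = tp / truth_n[c] if truth_n[c] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Cohen's kappa corrects observed agreement for agreement expected by chance.
    expected = sum(truth_n[c] * pred_n[c] for c in classes) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0

    k = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


# Toy labels, invented for illustration; these are NOT data from the study.
toy_truth = ["A", "A", "B", "B", "B", "C", "C", "D"]
toy_pred = ["A", "B", "B", "B", "B", "C", "B", "D"]
metrics = density_metrics(toy_truth, toy_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this toy case the accuracy is 0.75 while kappa is lower (about 0.636), because part of the raw agreement is what chance alone would produce given how often class B occurs. The same effect explains why the abstract reports a kappa of 0.708 alongside an accuracy of 0.819.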
Carebot Ltd 128 00 Prague Czech Republic
Department of Radiology Masaryk Memorial Cancer Institute 602 00 Brno Czech Republic
Department of Simulation Medicine Faculty of Medicine Masaryk University 625 00 Brno Czech Republic