• This record comes from PubMed

Enhancing Accuracy in Breast Density Assessment Using Deep Learning: A Multicentric, Multi-Reader Study

2024 May 28; 14 (11). [epub] 20240528

Status: PubMed-not-MEDLINE; Language: English; Country: Switzerland; Media: electronic

Document type Journal Article

Grant support
X001 Carebot, Ltd.

Links

PubMed 38893643
PubMed Central PMC11172127
DOI 10.3390/diagnostics14111117
PII: diagnostics14111117

Mammographic breast density, a critical indicator of breast cancer risk, is traditionally assessed by radiologists through visual inspection of mammography images, using the Breast Imaging-Reporting and Data System (BI-RADS) breast density categories. This method is subject to substantial interobserver variability, which leads to inconsistent and potentially inaccurate density assessments and subsequent risk estimates. To address this, we present a deep learning-based automatic detection algorithm (DLAD) for the automated evaluation of breast density. Our multicentric, multi-reader study uses a diverse dataset of 122 full-field digital mammography studies (488 images in CC and MLO projections) sourced from three institutions. Two experienced radiologists conducted a retrospective analysis, establishing a ground truth for 72 mammography studies (BI-RADS class A: 18, BI-RADS class B: 43, BI-RADS class C: 7, BI-RADS class D: 4). The performance of the DLAD was then compared with that of five independent radiologists with varying levels of experience. The DLAD achieved an accuracy of 0.819 (95% CI: 0.736-0.903), an F1 score of 0.798 (0.594-0.905), precision of 0.806 (0.596-0.896), recall of 0.830 (0.650-0.946), and a Cohen's Kappa (κ) of 0.708 (0.562-0.841). This performance matched and, in four cases, exceeded that of individual radiologists. Statistical analysis did not reveal a significant difference in accuracy between the DLAD and the radiologists, indicating that the model's assessments align with those of professional radiologists. These results demonstrate that the deep learning-based automatic detection algorithm can enhance the accuracy and consistency of breast density assessments, offering a reliable tool for improving breast cancer screening outcomes.
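The metrics reported above (accuracy, precision, recall, F1 score, and Cohen's Kappa with 95% confidence intervals) can be illustrated with a minimal sketch. Several things here are assumptions, not taken from the study: the abstract does not state how the confidence intervals were computed or how per-class scores were averaged, so the sketch uses a percentile bootstrap and macro averaging. The ground-truth labels reuse only the class counts given in the abstract (A: 18, B: 43, C: 7, D: 4); the predictions are entirely hypothetical and do not reproduce the DLAD's output. All function names are illustrative.

```python
import random
from collections import Counter

CLASSES = ["A", "B", "C", "D"]  # BI-RADS density categories


def accuracy(y_true, y_pred):
    """Fraction of studies where the predicted class equals the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_prf(y_true, y_pred, classes=CLASSES):
    """Macro-averaged precision, recall, and F1 (averaging choice is an assumption)."""
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precs) / n, sum(recs) / n, sum(f1s) / n


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / n ** 2
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0


def bootstrap_ci(metric, y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling studies with replacement (assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Reference labels: class counts from the abstract (72 studies in total).
y_true = ["A"] * 18 + ["B"] * 43 + ["C"] * 7 + ["D"] * 4

# HYPOTHETICAL predictions: 10 made-up errors, not the study's results.
y_pred = list(y_true)
y_pred[0:2] = ["B"] * 2    # two A studies read as B
y_pred[18:22] = ["A"] * 4  # four B studies read as A
y_pred[22:24] = ["C"] * 2  # two B studies read as C
y_pred[61] = "B"           # one C study read as B
y_pred[68] = "C"           # one D study read as C

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
acc_ci = bootstrap_ci(accuracy, y_true, y_pred)
kappa_ci = bootstrap_ci(cohen_kappa, y_true, y_pred)
print(f"accuracy = {acc:.3f}, 95% CI {acc_ci[0]:.3f}-{acc_ci[1]:.3f}")
print(f"kappa    = {kappa:.3f}, 95% CI {kappa_ci[0]:.3f}-{kappa_ci[1]:.3f}")
```

With only 7 and 4 studies in classes C and D, per-class scores shift substantially when a single study is misclassified, which is one plausible reason the reported intervals for F1, precision, and recall are wider than the interval for accuracy.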

