Machine learning versus clinicians for detection and classification of oral mucosal lesions
Language: English. Country: Great Britain, England. Medium: print-electronic
Document type: journal articles, comparative study
PubMed
40695439
DOI
10.1016/j.jdent.2025.105992
PII: S0300-5712(25)00436-1
- Keywords
- Artificial intelligence, Computer-assisted diagnosis, Deep learning, Early detection of cancer, Mouth neoplasms, Oral potentially malignant disorders (OPMDs), Squamous cell carcinoma
- MeSH
- deep learning MeSH
- humans MeSH
- mouth neoplasms * classification, diagnosis, pathology, diagnostic imaging MeSH
- mouth diseases * classification, diagnosis MeSH
- oral leukoplakia MeSH
- ROC curve MeSH
- sensitivity and specificity MeSH
- machine learning * MeSH
- mouth mucosa * pathology, diagnostic imaging MeSH
- dentists * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
- comparative study MeSH
OBJECTIVES: Detecting and classifying oral mucosal lesions is challenging because of their high heterogeneity and overlapping clinical appearance. Nevertheless, differentiating benign from potentially malignant lesions is essential for appropriate management. This study evaluated whether a deep learning model trained to discriminate 11 classes of oral mucosal lesions could exceed the performance of general dentists.
METHODS: A total of 4079 intraoral photographs of benign, potentially malignant and malignant oral lesions were labeled using bounding boxes and classified into 11 classes. An independent test set (n = 282) was held out, and the remaining data were split 80:20 into training (n = 3031) and validation (n = 766) sets. The YOLOv8 computer vision model was implemented for image classification and object detection. Model performance was evaluated on the test set, which was also assessed by six general dentists and three specialists in oral surgery. Evaluation metrics included sensitivity, specificity, F1-score, precision, area under the receiver operating characteristic curve (AUROC), and average precision (AP) at multiple thresholds of intersection over union.
RESULTS: In terms of classification, the highest F1-score (0.80) and AUROC (0.96) were observed for human papillomavirus (HPV)-related lesions, whereas the lowest F1-score (0.43) and AUROC (0.78) were obtained for keratosis. In terms of object detection, the best results were achieved for HPV-related lesions (AP25 = 0.82) and proliferative verrucous leukoplakia (AP25 = 0.80; AP50 = 0.76), while the lowest values were noted for leukoplakia (AP25 = 0.36; AP50 = 0.20). Overall, the model performed comparably to specialists (p = 0.93) and significantly better than general dentists (p < 0.01).
CONCLUSION: The developed model performed as well as specialists in oral surgery, highlighting its potential as a valuable tool for oral lesion assessment.
CLINICAL SIGNIFICANCE: By providing performance comparable to oral surgeons and superior to general dentists, the developed multi-class model could support the clinical evaluation of oral lesions, which may enable earlier diagnosis of potentially malignant disorders, enhance patient management and improve patient prognosis.
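The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the study's evaluation code, which the record does not provide. The function names, the box format (x_min, y_min, x_max, y_max) and all numbers below are assumptions made up for illustration. The sketch only shows how intersection over union (IoU) decides whether a predicted box counts as a hit at the 0.25 and 0.50 thresholds behind AP25 and AP50, and how sensitivity, specificity, precision and F1-score follow from one class's confusion counts. Full average precision also ranks detections by confidence, which is omitted here.

```python
from typing import Dict, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float) -> bool:
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """One-vs-rest metrics for a single class from its confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up boxes: half-overlapping squares give an IoU of one third.
    truth, pred = (0, 0, 10, 10), (5, 0, 15, 10)
    print(iou(pred, truth))
    print(is_match(pred, truth, 0.25), is_match(pred, truth, 0.50))
    # Made-up counts, not taken from the study.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=88))
```

In this made-up example the predicted box would count as a detection at the looser 0.25 threshold but not at 0.50. That is one reason an AP25 value for a lesion class can be higher than its AP50 value.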