INTRODUCTION: Interpretation of the 12-lead electrocardiogram (ECG) is normally assisted by an automated diagnosis (AD), which can give rise to an 'automation bias' in which interpreters become anchored to the AD. In this paper, we studied 1) the effect of an incorrect AD on interpretation accuracy and interpreter confidence (a proxy for uncertainty), and 2) whether confidence and other interpreter features can predict interpretation accuracy using machine learning. METHODS: This study analysed 9000 ECG interpretations from cardiology and non-cardiology fellows (CFs and non-CFs). One third of the ECGs had no AD, one third had an AD (half of which were incorrect) and one third had multiple ADs. Interpretations were scored, and interpreter confidence was recorded for each interpretation and subsequently standardised using sigma scaling. Spearman coefficients were used for correlation analysis, and C5.0 decision trees were used to predict interpretation accuracy from basic interpreter features such as confidence, age, experience and designation. RESULTS: Interpretation accuracies achieved by CFs and non-CFs dropped by 43.20% and 58.95% respectively when an incorrect AD was presented (p < 0.001). Overall correlation between scaled confidence and interpretation accuracy was higher amongst CFs. However, correlation between confidence and interpretation accuracy decreased for both groups when an incorrect AD was presented. We found that an incorrect AD disturbs the reliability of interpreter confidence in predicting accuracy. An incorrect AD has a greater effect on the confidence of non-CFs (although this is not statistically significant, it is close to the threshold, p = 0.065). The best C5.0 decision tree achieved an accuracy rate of 64.67% (p < 0.001); however, this is only 6.56% greater than the no-information rate. CONCLUSION: Incorrect ADs reduce the interpreter's diagnostic accuracy, indicating an automation bias. Non-CFs tend to agree more with the ADs than CFs do; hence, less expert physicians are more affected by automation bias. Incorrect ADs reduce the interpreter's confidence and also reduce the predictive power of confidence for predicting accuracy (even more so for non-CFs). Whilst a statistically significant model was developed, it is difficult to predict interpretation accuracy using machine learning on basic features such as interpreter confidence, age, reader experience and designation.
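The statistical steps named in the abstract above (sigma scaling of confidence, Spearman correlation, and the no-information rate used as a baseline for the decision tree) can be illustrated with a minimal sketch. It rests on assumptions: the abstract does not define sigma scaling, so it is taken here to be a z-score (subtract the mean, divide by the standard deviation); all function names and the sample data are hypothetical, and this is not the authors' code.

```python
from statistics import mean, pstdev


def sigma_scale(values):
    """Assumed form of sigma scaling: centre on the mean, divide by the SD."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # All ratings identical: no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


def no_information_rate(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(labels.count(c) for c in set(labels)) / len(labels)


# Hypothetical data: raw confidence ratings and correct (1) / incorrect (0) scores.
confidence = [3, 7, 5, 9, 1, 6]
scores = [0, 1, 1, 1, 0, 0]
print(spearman_rho(sigma_scale(confidence), scores))
print(no_information_rate(scores))
```

Because Spearman's coefficient depends only on ranks, scaling confidence within one interpreter leaves that interpreter's rho unchanged. Scaling matters when ratings from interpreters who use the confidence scale differently are pooled.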
- MeSH
- Automation * MeSH
- Diagnostic Errors statistics & numerical data MeSH
- Electrocardiography * MeSH
- Clinical Competence * MeSH
- Humans MeSH
- Uncertainty MeSH
- Observer Variation MeSH
- Decision Trees MeSH
- Arrhythmias, Cardiac diagnosis MeSH
- Bias (Epidemiology) MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
INTRODUCTION: Most contemporary 12-lead electrocardiogram (ECG) devices offer computerized diagnostic proposals. The reliability of these automated diagnoses is limited. It has been suggested that incorrect computer advice can influence physician decision-making. This study analyzed the role of diagnostic proposals in the decision process of a group of fellows of cardiology and other internal medicine subspecialties. MATERIALS AND METHODS: A set of 100 clinical 12-lead ECG tracings was selected covering both normal cases and common abnormalities. A team of 15 junior Cardiology Fellows and 15 Non-Cardiology Fellows interpreted the ECGs in 3 phases: without any diagnostic proposal, with a single diagnostic proposal (half of them intentionally incorrect), and with four diagnostic proposals (only one of them being correct) for each ECG. Self-rated confidence in each interpretation was collected. RESULTS: Availability of diagnostic proposals significantly increased the diagnostic accuracy (p<0.001). Nevertheless, in the case of a single proposal (either correct or incorrect), the increase in accuracy was present in interpretations with correct diagnostic proposals, while the accuracy was substantially reduced with incorrect proposals. Confidence levels correlated poorly with interpretation scores (rho≈2, p<0.001). Logistic regression showed that an interpreter is most likely to be correct when the ECG offers a correct diagnostic proposal (OR=10.87) or multiple proposals (OR=4.43). CONCLUSION: Diagnostic proposals affect the diagnostic accuracy of ECG interpretations. The accuracy is significantly influenced especially when a single diagnostic proposal (either correct or incorrect) is provided. The study suggests that the presentation of multiple computerized diagnoses is likely to improve the diagnostic accuracy of interpreters.
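The odds ratios reported from the logistic regression above can be read with a short sketch. It shows how an odds ratio relates to a logistic regression coefficient, how one would be computed from a 2x2 table, and how it shifts a probability. The function names, the counts and the 50% baseline are hypothetical and chosen only for illustration; the abstract gives neither the model's reference category nor its baseline accuracy.

```python
import math


def odds_ratio_from_coefficient(beta):
    """A logistic regression coefficient is a log odds ratio, so OR = exp(beta)."""
    return math.exp(beta)


def odds_ratio_from_counts(correct_with, incorrect_with,
                           correct_without, incorrect_without):
    """Odds of a correct interpretation with the factor present,
    divided by the odds of a correct interpretation without it."""
    return (correct_with * incorrect_without) / (incorrect_with * correct_without)


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a baseline probability to odds, multiply by the OR, convert back."""
    odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Hypothetical baseline: an interpreter who is correct half the time.
# The ORs are the two values quoted in the abstract.
for label, reported_or in [("correct single proposal", 10.87),
                           ("multiple proposals", 4.43)]:
    print(label, round(apply_odds_ratio(0.5, reported_or), 3))
```

An odds ratio multiplies odds, not probabilities, so the resulting change in accuracy depends on the baseline. The same OR produces a smaller gain in percentage points when the baseline accuracy is already high.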
- MeSH
- Diagnostic Errors statistics & numerical data MeSH
- Electrocardiography statistics & numerical data MeSH
- Cardiology MeSH
- Clinical Competence statistics & numerical data MeSH
- Humans MeSH
- Observer Variation MeSH
- Arrhythmias, Cardiac diagnosis MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH