Reproducible Machine Learning-Based Voice Pathology Detection: Introducing the Pitch Difference Feature
Status: Publisher; Language: English; Country: United States; Medium: print-electronic
Document type: journal articles
PubMed: 40221253
DOI: 10.1016/j.jvoice.2025.03.028
PII: S0892-1997(25)00122-5
- Keywords
- Voice pathology detection; Voice disorder detection; Saarbrücken Voice Database; SVD; Machine learning; REFORMS
- Publication type
- journal articles (MeSH)
PURPOSE: We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database and a robust feature set that combines commonly used handcrafted acoustic features with two novel ones: pitch difference (relative variation in fundamental frequency) and the NaN feature (failed fundamental frequency estimation).
METHODS: We evaluate six machine learning (ML) algorithms (support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost) using grid search over feasible hyperparameters and 20,480 different feature subsets. For each ML algorithm, the top 1000 combinations of classification model and feature subset are validated with repeated stratified cross-validation. To address class imbalance, we apply the k-means synthetic minority oversampling technique to augment the training data.
RESULTS: Our approach achieves an unweighted average recall of 85.61% for females, 84.69% for males, and 85.22% for the combined results. We intentionally omit accuracy because it is a highly biased metric for imbalanced data.
CONCLUSION: Our study demonstrates that, with the proposed methodology and feature engineering, ML models show potential for detecting various voice pathologies from the simplest vocal task, a sustained utterance of the vowel /a:/. To make our methodology easier to use and to support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist to enhance the readability, reproducibility, and justification of our approach.
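The abstract's choice of unweighted average recall (UAR) over accuracy can be illustrated with a small sketch. The code below is not taken from the authors' repository; the function names and the 90/10 toy labels are hypothetical and chosen only to show how a classifier that ignores the minority class can still score high accuracy while its UAR stays at chance level.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally regardless of size."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy set: 90 pathological (1) and 10 healthy (0) samples.
y_true = [1] * 90 + [0] * 10
# A degenerate classifier that always predicts "pathological".
y_pred = [1] * 100

acc = accuracy(y_true, y_pred)                    # dominated by the majority class
uar = unweighted_average_recall(y_true, y_pred)   # averages recall over both classes
print(f"accuracy={acc:.2f}, UAR={uar:.2f}")
```

In this toy case the always-positive classifier gets every majority sample right and every minority sample wrong, so accuracy rewards it while UAR does not. For two classes, UAR is the same quantity that is often called balanced accuracy.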