Most cited article - PubMed ID 37801807
Speech and language technologies are effective tools for identifying the distinct speech changes associated with Parkinson's disease (PD), enabling earlier and more accurate diagnosis. Models leveraging recent advancements in self-supervised speech pretraining, such as Wav2Vec, have demonstrated superior performance over traditional feature extraction methods. While Wav2Vec 2.0 has been successfully used for PD detection, a rigorous quantitative comparison with Wav2Vec 1.0 is needed to evaluate its advantages, limitations, and applicability across different speech modes in PD. This study presents a systematic comparison of Wav2Vec 1.0 and Wav2Vec 2.0 embeddings across three multilingual datasets, using various classification approaches to distinguish the speech of healthy controls (HC) from PD-affected speech. Additionally, both Wav2Vec 1.0 and 2.0 were benchmarked against traditional baseline features across diverse linguistic contexts, including spontaneous speech, non-spontaneous speech, and isolated vowels. A multicriteria TOPSIS approach was employed to rank feature extraction methods, revealing that Wav2Vec 2.0 excelled across speech modes, with its first transformer layer performing best for classifying read text and monologue, and its feature extractor performing best in vowel-based classification. In contrast, Wav2Vec 1.0, while generally outperformed by Wav2Vec 2.0, still provided a more efficient alternative with competitive performance. Finally, we combined selected layers from both architectures and demonstrated improved diagnostic accuracy in vowel-based classification. This comparative analysis underscores the strengths of both Wav2Vec architectures and informs their optimal use in PD detection.
- Keywords
- Classification, Parkinson's disease, Speech modes, Wav2vec 1.0, Wav2vec 2.0
- Publication type
- Journal Article MeSH
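The TOPSIS ranking mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic textbook form of TOPSIS (vector normalisation, Euclidean distances), not necessarily the configuration used in the study. The method names, criteria, weights, and all numbers below are hypothetical placeholders, not results from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores (higher is better), one per row.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is
    Assumes no criterion column is all zeros and the rows are not all identical.
    """
    n_criteria = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    # Ideal and anti-ideal points, taking criterion direction into account.
    columns = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Relative closeness to the ideal point.
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical example: three feature sets scored on accuracy, AUC
# (both benefit criteria) and extraction time in seconds (cost criterion).
methods = ["feature_set_A", "feature_set_B", "feature_set_C"]
matrix = [
    [0.80, 0.85, 2.0],
    [0.75, 0.80, 1.0],
    [0.60, 0.65, 0.5],
]
weights = [0.4, 0.4, 0.2]
benefit = [True, True, False]

scores = topsis(matrix, weights, benefit)
ranking = sorted(zip(methods, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The choice of weights and of which criteria count as costs strongly influences the ranking, so both would need to be taken from the study itself before drawing any conclusions.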
Advancements in deep learning speech representations have facilitated the effective use of extensive unlabeled speech datasets for Parkinson's disease (PD) modeling with minimal annotated data. This study employs the non-fine-tuned wav2vec 1.0 architecture to develop machine learning models for PD speech diagnosis tasks, such as cross-database classification and regression to predict demographic and articulation characteristics. The primary aim is to analyze overlapping components within the embeddings on both classification and regression tasks, investigating whether latent speech representations in PD are shared across models, particularly for related tasks. First, evaluation on three multi-language PD datasets showed that wav2vec accurately detected PD based on speech, outperforming feature extraction using mel-frequency cepstral coefficients in the proposed cross-database classification scenarios. In cross-database scenarios using read texts in Italian and English, wav2vec demonstrated performance comparable to intra-dataset evaluations. We also compared our cross-database findings against those of other related studies. Second, wav2vec proved effective in regression, modeling various quantitative speech characteristics related to articulation and aging. Finally, a subsequent feature importance analysis examined whether classification and regression models overlap significantly. These experiments revealed shared features across trained models, with increased sharing for related tasks, further suggesting that wav2vec contributes to improved generalizability. The study proposes wav2vec embeddings as a promising next step toward a speech-based universal model to assist in the evaluation of PD.
- Keywords
- Parkinson’s disease, classification, cross-database, feature importance, regression, wav2vec
- MeSH
- Databases, Factual * MeSH
- Deep Learning MeSH
- Middle Aged MeSH
- Humans MeSH
- Parkinson Disease * physiopathology MeSH
- Speech * physiology MeSH
- Aged MeSH
- Machine Learning MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
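One simple way to quantify the kind of feature sharing between models described in the abstract above is to compare the top-k most important embedding dimensions of two trained models. The sketch below uses Jaccard overlap as an assumed measure; the abstract does not state which overlap measure the study used. The function names and the importance values are hypothetical.

```python
def top_k_indices(importances, k):
    """Return the set of indices of the k largest importance values."""
    order = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return set(order[:k])


def jaccard_overlap(importances_a, importances_b, k):
    """Jaccard overlap between the top-k features of two models (0.0 to 1.0)."""
    top_a = top_k_indices(importances_a, k)
    top_b = top_k_indices(importances_b, k)
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0


# Hypothetical importance vectors over four embedding dimensions,
# e.g. from a classification model and a regression model.
classifier_importances = [0.9, 0.1, 0.8, 0.0]
regressor_importances = [0.7, 0.6, 0.1, 0.0]

overlap = jaccard_overlap(classifier_importances, regressor_importances, k=2)
print(f"Top-2 feature overlap: {overlap:.2f}")
```

A higher overlap for related tasks than for unrelated ones would be consistent with the sharing pattern the abstract reports, although a real analysis would also need a baseline for the overlap expected by chance.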