Automatic Evaluation of Voice Quality Using Text-Based Laryngograph Measurements and Prosodic Analysis
Language English Country United States Medium print-electronic
Document type journal articles, grant-supported research
PubMed 26136813
PubMed Central PMC4468283
DOI 10.1155/2015/316325
- MeSH
- Hoarseness diagnosis MeSH
- Child MeSH
- Adult MeSH
- Voice Quality * MeSH
- Middle Aged MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Speech Perception MeSH
- Signal Processing, Computer-Assisted * MeSH
- Voice Disorders diagnosis MeSH
- Speech * MeSH
- Speech Therapy MeSH
- Regression Analysis MeSH
- Reproducibility of Results MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Software MeSH
- Sound Spectrography methods MeSH
- Check Tag
- Child MeSH
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Adolescent MeSH
- Young Adult MeSH
- Male MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Grant-supported research MeSH
Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text "The North Wind and the Sun" were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners' ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.
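The human-machine agreement in the abstract is reported as Pearson's r and Spearman's ρ between the listeners' ratings and the model output. The following is a minimal sketch of how those two coefficients can be computed, using only the Python standard library. The function names and the toy numbers are illustrative assumptions, not the study's data or software, and the Support Vector Regression step itself is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented toy values for illustration only (NOT data from the study):
    # mean perceptual roughness rating per speaker vs. a model's prediction.
    human_mean_rating = [0.2, 1.1, 2.4, 0.8, 1.9]
    machine_prediction = [0.5, 1.0, 2.0, 1.2, 1.7]
    print("r   =", round(pearson_r(human_mean_rating, machine_prediction), 2))
    print("rho =", round(spearman_rho(human_mean_rating, machine_prediction), 2))
```

Reporting both coefficients is informative because r is sensitive to the linear relation between the rating values, whereas ρ depends only on how the speakers are ordered.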