Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System
Language: English | Country: United States | Media: print-electronic
Document type: Journal Article
PubMed: 26346654
PubMed Central: PMC4539500
DOI: 10.1155/2015/573068
MeSH terms (* = major topic):
- Algorithms *
- Databases, Factual
- Emotions / physiology
- Voice Quality
- Humans
- Neural Networks, Computer
- Signal Processing, Computer-Assisted / instrumentation
- Speech / physiology
- ROC Curve
- Pattern Recognition, Automated *
- Pattern Recognition, Physiological / physiology
This paper discusses the impact of the classification method and feature selection on the accuracy of speech emotion recognition. Selecting the right parameters in combination with the classifier is an important step in reducing the computational complexity of the system, which is essential for systems intended for real-time deployment. The motivation for developing and improving speech emotion recognition systems is their wide applicability in today's automatic voice-controlled systems. The Berlin database of emotional recordings was used in this experiment. The classification accuracy of artificial neural networks, k-nearest neighbours, and Gaussian mixture models is measured with respect to the selection of prosodic, spectral, and voice quality features. The goal was to find the optimal combination of classification method and feature group for stress detection in human speech. The research contribution lies in the design of a speech emotion recognition system that balances accuracy and efficiency.
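As an illustration of the classification step the abstract describes, the sketch below implements a minimal k-nearest-neighbours classifier over per-utterance feature vectors. This is not the paper's implementation: the feature choice (mean F0, energy, spectral centroid) and all numeric values are hypothetical toy data standing in for prosodic/spectral features extracted from real recordings.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Classify feature vector x by majority vote among its k nearest training samples."""
    # Euclidean distance from x to every training vector
    distances = np.linalg.norm(train_X - x, axis=1)
    # Labels of the k closest training samples
    nearest_labels = train_y[np.argsort(distances)[:k]]
    # Majority vote
    labels, counts = np.unique(nearest_labels, return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical per-utterance features: [mean F0 (Hz), normalized energy, spectral centroid (Hz)]
train_X = np.array([
    [120.0, 0.2, 1500.0],   # neutral speech (illustrative values)
    [130.0, 0.3, 1600.0],   # neutral
    [220.0, 0.8, 2600.0],   # stressed
    [210.0, 0.7, 2500.0],   # stressed
])
train_y = np.array(["neutral", "neutral", "stressed", "stressed"])

# An unseen utterance with raised pitch and energy is voted "stressed"
print(knn_predict(train_X, train_y, np.array([215.0, 0.75, 2550.0])))  # stressed
```

In a real pipeline of the kind the paper evaluates, the toy rows would be replaced by feature vectors extracted from the Berlin database recordings, and k would be tuned alongside the feature-group selection.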