Emotions in speech and statistical analysis of a Vietnamese emotional speech corpus

Lê Xuân Thành; Đào Thị Thủy; Trịnh Văn Loan; Nguyễn Hồng Quang

doi:10.32913/mic-ict-research-vn.v1.n35.233

Lê Xuân Thành, Hanoi University of Science and Technology
Đào Thị Thủy, Hanoi Vocational College of High Technology
Trịnh Văn Loan, Hanoi University of Science and Technology
Nguyễn Hồng Quang, Hanoi University of Science and Technology


Abstract

Research on emotional speech has been carried out for many languages around the world, while for Vietnamese it has only just begun. This paper presents results on the main features of four basic emotions: happiness, sadness, anger, and neutrality. Our preliminary study of Vietnamese emotional speech shows that, in general, anger and happiness have higher speech energy and fundamental frequency (F0) than the neutral emotion, while sadness has the lowest energy and fundamental frequency. These observations come from statistical methods, namely analysis of variance (ANOVA) and Tukey's test, applied to our Vietnamese emotion corpus. The SMO, IBk, and J48 decision-tree classifiers were used for preliminary emotion identification on the BKEmo corpus. The highest recognition rate is 98.17% for the IBk classifier using 384 feature parameters; this rate decreases to 82.59% when only the 48 parameters related to F0 and intensity are used.
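As a rough illustration of the kind of test named above, the following minimal sketch computes a one-way ANOVA F statistic for a single acoustic feature grouped by emotion. The function name `one_way_anova_f` and the F0 values are hypothetical and made up for illustration; they are not taken from the BKEmo corpus, and this is not the authors' analysis code.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mean-F0 values in Hz, one list per emotion (NOT BKEmo data)
f0_by_emotion = {
    "neutral": [200.0, 210.0, 205.0, 195.0],
    "happy": [260.0, 270.0, 255.0, 265.0],
    "angry": [280.0, 275.0, 290.0, 285.0],
    "sad": [180.0, 175.0, 185.0, 170.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(f0_by_emotion.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the reported degrees of freedom suggests that at least one emotion differs in mean F0. A post-hoc procedure such as Tukey's test would then be needed to identify which pairs of emotions differ.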

