Emotion Recognition in Speech Signal: Experimental Study, Development and Applications

Video Link: https://www.youtube.com/watch?v=whkJwLjyBWY
Duration: 1:23:00

In this talk I will overview my research on emotion expression and emotion recognition in the speech signal and its applications. Two proprietary databases of emotional utterances were used in this research. The first database consists of 700 emotional utterances in English pronounced by 30 subjects portraying five emotional states: unemotional (normal), anger, happiness, sadness, and fear. The second database consists of 3,660 emotional utterances in Russian by 61 subjects portraying six emotional states: unemotional, anger, happiness, sadness, fear, and surprise.

An experimental study was conducted to determine how well people recognize emotions in speech. Based on the results of the experiment, the most reliable utterances were selected for feature selection and for training recognizers. Several machine learning techniques were applied to create recognition agents, including k-nearest neighbor, neural networks, and ensembles of neural networks. The agents can recognize five emotional states with the following accuracy: normal (unemotional) state, 55-75%; anger, 70-80%; and fear, 35-55%. The agents can be adapted to a particular environment depending on the parameters of the speech signal and the number of target emotional states.

For a practical application, an agent was created that analyzes telephone-quality speech and distinguishes between two emotional states (agitation, which includes anger, happiness, and fear; and calm, which includes the normal state and sadness) with an accuracy of 77%. The agent was used as part of a decision support system for prioritizing voice messages and assigning an appropriate human agent to respond to each message in a call-center environment. I will also give a summary of other research topics in the lab, including fast pitch-synchronous segmentation of the speech signal, the use of speech analysis techniques for language learning, and video clip recognition using a joint audio-visual model.
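The recognition approach described above can be sketched roughly as follows. This is a minimal illustration, not the talk's actual system: the feature names and values below are hypothetical stand-ins for prosodic statistics a real agent would extract from the speech signal, and k-nearest neighbor is just one of the techniques the abstract mentions. The mapping of five emotions into agitation vs. calm follows the two-class grouping the abstract describes.

```python
import math
from collections import Counter

# Hypothetical prosodic feature vectors for labeled utterances:
# (mean pitch in Hz, pitch range in Hz, mean energy in dB).
# A real system would compute such statistics from speech frames.
TRAIN = [
    ((110.0, 20.0, 60.0), "normal"),
    ((180.0, 90.0, 75.0), "anger"),
    ((170.0, 80.0, 72.0), "happiness"),
    ((100.0, 15.0, 55.0), "sadness"),
    ((160.0, 70.0, 65.0), "fear"),
    ((115.0, 25.0, 62.0), "normal"),
    ((185.0, 95.0, 78.0), "anger"),
]

def classify(features, k=3):
    """Label an utterance by majority vote among its k nearest neighbors."""
    dists = sorted((math.dist(features, x), label) for x, label in TRAIN)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# The call-center agent collapses the five states into two:
# "agitation" (anger, happiness, fear) vs. "calm" (normal, sadness).
AGITATION = {"anger", "happiness", "fear"}

def is_agitated(features):
    return classify(features) in AGITATION

# A high-pitch, wide-range, high-energy utterance lands near the
# anger/happiness examples and is therefore flagged as agitation.
print(classify((175.0, 85.0, 74.0)))     # → anger
print(is_agitated((175.0, 85.0, 74.0)))  # → True
```

The same nearest-neighbor classifier serves both tasks: the five-way labels are computed once, and the two-class agent simply maps each predicted label into the agitation or calm group, which is one way an agent can be "adapted" to a different number of target emotional states.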




Other Videos By Microsoft Research


2016-09-05 Structural Comparison of Executable Objects
2016-09-05 Indifference is Death: Responsibility, Leadership, & Innovation
2016-09-05 TQFTs and tight contact structures on 3-manifolds
2016-09-05 Wireless Embedded Networks/The Ecosystem and Cool Challenges
2016-09-05 Data Mining & Machine Learning to empower business strategy
2016-09-05 Some uses of orthogonal polynomials
2016-09-05 Approximation Algorithms for Embedding with Extra Information and Ordinal Relaxation
2016-09-05 Evaluating Retrieval System Effectiveness
2016-09-05 Exploiting the Transients of Adaptation for RoQ Attacks on Internet Resources
2016-09-05 Specification-Based Annotation Inference
2016-09-05 Emotion Recognition in Speech Signal: Experimental Study, Development and Applications
2016-09-05 Text summarization: News and Beyond
2016-09-05 Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution
2016-09-05 Learning and Inferring Transportation Routines
2016-09-05 Raising the Bar: Integrity and Passion in Life and Business: The Story of Clif Bar, Inc.
2016-09-05 Revelationary Computing, Proactive Displays and The Experience UbiComp Project
2016-09-05 The Design of A Formal Property-Specification Language
2016-09-05 Data Harvesting: A Random Coding Approach to Rapid Dissemination and Efficient Storage of Data
2016-09-05 Runtime Refinement Checking for Concurrent Data Structures
2016-09-05 Lost in Space: The Fall of NASA and the Dream of a New Space Age
2016-09-05 Solving Geometric Matching Problems using Interval Arithmetic Optimization



Tags:
microsoft research