Semi-supervised Multi-task learning for acoustic parameter estimation

Subscribers: 342,000
Published on: 2023-11-10 ● Video Link: https://www.youtube.com/watch?v=U5hcROugVkM



Duration: 29:33
374 views


Speakers: Haleh Akrami
Host: Hannes Gamper

The acoustic properties of a room can affect the quality of speech audio for listeners. Two basic acoustic parameters that characterize the environment well are the reverberation time (RT60), defined as the time it takes for the sound energy to decay by 60 dB after the source is switched off, and clarity (C50/C80), measured as the ratio between the energy of the early reflections (up to 50/80 ms) and the energy of the late response of the decay curve. Another important aspect of speech is its quality. Many algorithms have been developed for speech enhancement and for removing noise, echo, and reverberation, but these algorithms do not necessarily improve speech quality as assessed by human perception. The mean opinion score (MOS) is a standardized measure for the perceptual evaluation of speech quality, obtained by asking listeners to rate the quality of an audio signal on a scale from one to five. We are interested in estimating RT60, C50, and MOS in a multi-task framework. We combined different datasets that are partially labeled and applied a semi-supervised approach to estimate RT60, C50, and MOS simultaneously.
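For readers unfamiliar with these parameters, the following minimal Python sketch illustrates the standard definitions of C50 and RT60 computed from a room impulse response. It is not the estimation model presented in the talk; the impulse response, sample rate, and decay-curve fitting range are illustrative assumptions.

import numpy as np

def clarity_c50(rir, fs, split_ms=50):
    # Clarity: ratio of early energy (first split_ms milliseconds) to late energy, in dB.
    split = int(fs * split_ms / 1000)
    early = np.sum(rir[:split] ** 2)
    late = np.sum(rir[split:] ** 2)
    return 10 * np.log10(early / late)

def rt60_from_edc(rir, fs):
    # Estimate RT60 from the Schroeder energy decay curve by fitting a line
    # between -5 dB and -25 dB and extrapolating to a 60 dB decay.
    edc = np.cumsum((rir ** 2)[::-1])[::-1]          # backward (Schroeder) integration
    edc_db = 10 * np.log10(edc / edc[0])
    t = np.arange(len(rir)) / fs
    mask = (edc_db <= -5) & (edc_db >= -25)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # decay rate in dB per second
    return -60.0 / slope

# Toy example: exponentially decaying noise standing in for a measured impulse response.
fs = 16000
t = np.arange(fs) / fs
rir = np.random.randn(fs) * np.exp(-6 * t)
print(f"C50 = {clarity_c50(rir, fs):.1f} dB, RT60 = {rt60_from_edc(rir, fs):.2f} s")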

Learn more: https://www.microsoft.com/en-us/research/video/semi-supervised-multi-task-learning-for-acoustic-parameter-estimation/




Other Videos By Microsoft Research


2023-12-05 AI Forum 2023 | AI for Neurodiverse Society
2023-12-05 AI Forum 2023 | Bridging Disciplines: Exploring the Frontiers of New Computing Paradigms
2023-12-05 AI Forum 2023 | Innovating Intelligent Environments for Wireless Communication & Sensing
2023-12-05 AI Forum 2023 | Towards Responsible AI Deployment
2023-12-05 AI Forum 2023 | AI4Science: Accelerating Scientific Discovery with Artificial Intelligence
2023-12-05 AI Forum 2023 | Harnessing AI for a Greener Tomorrow
2023-12-05 AI Forum 2023 | Panel Discussion “AI Synergy: Science and Society”
2023-12-05 AI Forum 2023 | Future of Foundation Models
2023-11-30 PwR: Using representations for AI-powered software development
2023-11-10 Binaural spatial audio positioning in video calls
2023-11-10 Semi-supervised Multi-task learning for acoustic parameter estimation
2023-11-10 Research intern talk: Real-time single-channel speech separation in noisy & reverberant environments
2023-11-10 Research intern talk: Unified speech enhancement approach for speech degradation & noise suppression
2023-11-10 Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss
2023-11-09 Supporting the Responsible AI Red-Teaming Human Infrastructure | Jina Suh
2023-11-08 Project Mosaic
2023-11-02 Supporting the Responsible AI Red-Teaming Human Infrastructure | Jina Suh
2023-11-02 Sociotechnical Approaches to Measuring Harms Caused by AI Systems | Hanna Wallach
2023-11-02 Storytelling and futurism | Matt Corwine
2023-11-02 Regulatory Innovation to Enable Use of Generative AI in Drug Development | Stephanie Simmons
2023-11-02 AI Powered Community Micro-Grid for Resiliency and Equitability | Peeyush Kumar