Supervised Deep Hashing for Efficient Audio Retrieval

Published on: 2020-06-10
Video Link: https://www.youtube.com/watch?v=yg-Hbu9GbRs



Duration: 1:03:38


Audio Event Classification (AEC) is the ability of a machine to assign a semantic label to a given audio segment. Despite multiple efforts to learn better and more robust audio representations (or embeddings), there has been little research on the efficient retrieval of audio events. Fast retrieval can enable near-real-time similarity search between a query sound and a database containing millions of audio events.

This work, the first of its kind, investigates the effectiveness of different hashing techniques for efficient audio event retrieval. We employ state-of-the-art audio embeddings as features and first analyze the performance of several classical unsupervised hashing algorithms. We then show that using a small portion of the annotated database for supervised hashing via a Deep Quantization Network (DQN) can significantly boost retrieval performance. The detailed experimental results, extensive analysis, and comparison between supervised and unsupervised hashing methods can provide insight into the quantizability of the employed audio embeddings and allow performance evaluation of such an audio retrieval system.
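
To make the idea of hashing-based retrieval concrete, here is a minimal sketch of one classical unsupervised approach: random-hyperplane locality-sensitive hashing followed by Hamming-distance ranking. It is an illustration only, not the method used in the talk. The abstract does not say which unsupervised algorithms were evaluated, and the supervised DQN is not implemented here. The embeddings are random toy data, and all function names, the 128-dimensional embedding size, and the 64-bit code length are assumptions chosen for the example.

```python
import numpy as np


def fit_random_hyperplanes(dim, n_bits, seed=0):
    """Draw one random hyperplane (a normal vector) per output bit."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))


def hash_embeddings(embeddings, planes):
    """Map real-valued embeddings to binary codes.

    Each bit records on which side of a hyperplane the embedding falls.
    """
    return (embeddings @ planes > 0).astype(np.uint8)


def hamming_search(query_code, db_codes, top_k=5):
    """Return indices and Hamming distances of the top_k closest codes."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")[:top_k]
    return order, distances[order]


# Toy stand-in for a database of audio embeddings (assumed 128-dimensional).
rng = np.random.default_rng(42)
database = rng.standard_normal((1000, 128))

planes = fit_random_hyperplanes(dim=128, n_bits=64)
db_codes = hash_embeddings(database, planes)

# A query that is a slightly perturbed copy of database item 17.
query = database[17] + 0.01 * rng.standard_normal(128)
query_code = hash_embeddings(query[None, :], planes)[0]

indices, distances = hamming_search(query_code, db_codes, top_k=5)
print(indices, distances)
```

Comparing short binary codes is much cheaper than comparing full floating-point embeddings, which is what makes search over millions of items practical. A supervised method such as DQN would instead learn the mapping to codes from labeled examples, so that items sharing a semantic label tend to receive nearby codes.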

See more at https://www.microsoft.com/en-us/research/video/supervised-deep-hashing-for-efficient-audio-retrieval/







Tags:
Audio Event Classification (AEC)
audio and acoustics research
efficient retrieval of audio events
hashing techniques
supervised hashing
Deep Quantization Network (DQN)
Arindam Jati
Dimitra Emmanouilidou
Microsoft Research