Locality-Sensitive Hashing and Beyond

Subscribers: 343,000
Video Link: https://www.youtube.com/watch?v=t_8SpFV0l7A
Duration: 54:28
Views: 8,046
Likes: 94

Locality-Sensitive Hashing (LSH) is a powerful technique for approximate nearest neighbor search (ANN) in high dimensions. In this talk I will present two recent results. First, I will show a data structure for ANN under the Euclidean distance that provably outperforms the best possible LSH-based data structure; we proceed by designing a good data-dependent hash family. Second, I will show a practical and optimal LSH family for cosine similarity (equivalently, Euclidean distance on the sphere) that substantially outperforms the celebrated Hyperplane LSH family. Along the way, I will try to debunk two popular myths about LSH:

- LSH-based data structures consume too much memory and are thus impractical;
- optimal LSH constructions are too complicated to be made practical.

The talk is based on two papers: arXiv:1501.01062 (joint with Alexandr Andoni, STOC 2015) and arXiv:1509.02897 (joint with Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, and Ludwig Schmidt, NIPS 2015).
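For context, below is a minimal NumPy sketch of the baseline Hyperplane LSH family (SimHash) for cosine similarity that the abstract compares against. The class name, parameters, and single-table bucketing scheme are illustrative assumptions for exposition, not code from the talk or the papers.

```python
import numpy as np

class HyperplaneLSH:
    """Illustrative Hyperplane LSH (SimHash) for cosine similarity.

    Each hash bit is the sign of a projection onto a random hyperplane normal;
    two points agree on a bit with probability 1 - theta/pi, where theta is the
    angle between them.
    """

    def __init__(self, dim, num_bits, seed=0):
        rng = np.random.default_rng(seed)
        # One random Gaussian normal vector per hash bit.
        self.planes = rng.standard_normal((num_bits, dim))

    def hash(self, x):
        # Concatenate the sign bits into a hashable key.
        bits = (self.planes @ x) >= 0
        return tuple(bits.astype(int))

# Usage sketch: bucket a random dataset, then look up a query's bucket.
rng = np.random.default_rng(1)
data = rng.standard_normal((1000, 128))
lsh = HyperplaneLSH(dim=128, num_bits=16)

buckets = {}
for i, v in enumerate(data):
    buckets.setdefault(lsh.hash(v), []).append(i)

query = data[0] + 0.05 * rng.standard_normal(128)  # a slightly perturbed data point
candidates = buckets.get(lsh.hash(query), [])
print(len(candidates), "candidate(s) in the query's bucket")
```

In practice one would use several independent hash tables and fewer bits per table to trade memory for recall; the talk's point is that better (data-dependent and cross-polytope-style) hash families improve on this baseline.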




Other Videos By Microsoft Research


2016-06-22  How Much Information Does a Human Translator Add to the Original and Multi-Source Neural Translation
2016-06-22  Generative probabilistic programming: applications and new ideas
2016-06-22  Making Small Spaces Feel Large: Practical Illusions in Virtual Reality
2016-06-22  Python+Machine Learning tutorial - Working with textual data
2016-06-22  Coping with the Intractability of Graphical Models
2016-06-22  Symposium: Algorithms Among Us - Intro & Tom Dietterich
2016-06-22  Gates Foundation Presents: Crucial Areas of Fintech Innovation for the Bottom of the Pyramid
2016-06-22  Symposium: Brains, Minds and Machines - Andrew Saxe
2016-06-22  Machine Learning in Health Care
2016-06-22  Human factors of software updates
2016-06-22  Locality-Sensitive Hashing and Beyond
2016-06-22  Oral Session: Competitive Distribution Estimation: Why is Good-Turing Good
2016-06-22  Learning to Control
2016-06-22  Oral Session: Deep Visual Analogy-Making
2016-06-22  Symposium: Brains, Minds and Machines - Joshua Tenenbaum
2016-06-22  Oral Session: Probabilistic Line Searches for Stochastic Optimization
2016-06-22  Intelligent Control of Crowdsourcing
2016-06-22  Python+Machine Learning tutorial - Data munging for predictive modeling with pandas and scikit-learn
2016-06-22  Oral Session: Efficient Exact Gradient Update for training Deep Networks with Large Sparse Targets
2016-06-22  IMS-Microsoft Research Workshop: Foundations of Data Science - Cyberspace, the Final Frontier
2016-06-22  Bringing Harmony Through AI and Economics



Tags:
microsoft research
lsh
computer systems and networking