Similarity Search: A Web Perspective
Google Tech Talks
October 18, 2007
ABSTRACT
Similarity search is the problem of preprocessing a database of N objects so that, given a query object, one can efficiently determine its nearest neighbors in the database. The "geometric near-neighbor access tree" data structure, an early work (1995) by Sergey Brin, is one of the best-known solutions to this problem.
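To make the problem statement concrete, here is a minimal Python sketch of the naive baseline: no preprocessing at all, just a linear scan that compares the query against every object. It is neither the geometric near-neighbor access tree nor one of the algorithms from the talk. The function names, the Euclidean distance, and the sample points are all assumptions chosen for illustration.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; an assumed choice, any dissimilarity function works."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(
    database: List[Point],
    query: Point,
    k: int = 1,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Tuple[float, int]]:
    """Return the k closest objects as (distance, index) pairs, closest first.

    Every query costs N distance computations; similarity search
    structures preprocess the database to avoid exactly this cost.
    """
    scored = ((dist(obj, query), i) for i, obj in enumerate(database))
    return heapq.nsmallest(k, scored)


# Hypothetical toy database of 2-D points.
database = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 0.0)]
result = nearest_neighbors(database, (0.9, 0.8), k=2)
print([index for _, index in result])
```

The scan touches all N objects for every query, which is the cost that preprocessing is meant to reduce.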
Similarity search is closely connected to many algorithmic problems on the web, and it abstracts many of the problems we face in data management. In this talk we will focus on:
- Personalized news aggregation: searching for the news articles most similar to a user's profile of interests.
- Behavioral targeting: searching for the most relevant advertisement to display to a given user.
- Social network analysis: suggesting new friends.
- Computing co-occurrence similarities.
- "Best match search": searching resumes, jobs, boyfriends/girlfriends, cars, apartments.
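As one illustration of how an application in this list reduces to similarity search, the sketch below treats personalized news aggregation as ranking articles by cosine similarity to a user profile. The sparse term-weight representation, the cosine measure, and all names and data are assumptions for illustration; the abstract does not say which similarity function the talk uses.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # assumed representation: term -> weight


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_articles(profile: Vector, articles: Dict[str, Vector]) -> List[Tuple[float, str]]:
    """Return (similarity, title) pairs, most similar first."""
    scored = [(cosine(profile, vec), title) for title, vec in articles.items()]
    return sorted(scored, reverse=True)


# Hypothetical user profile and article vectors.
profile = {"search": 1.0, "algorithms": 1.0}
articles = {
    "A": {"search": 1.0, "algorithms": 1.0},
    "B": {"sports": 1.0},
    "C": {"search": 1.0, "politics": 1.0},
}
ranked = rank_articles(profile, articles)
print([title for _, title in ranked])
```

Here the user profile plays the role of the query object and the articles form the database; the other applications in the list fit the same pattern with different objects and similarity functions.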
We describe the features that make web applications somewhat different from previously studied models, and we therefore re-examine both the formalization of similarity search and its classical algorithms. This leads us to new algorithms, two of which we present, and to numerous open problems in the field.
Speaker: Yury Lifshits
Yury Lifshits obtained his PhD degree from Steklov Institute of Mathematics at S...