Random Sampling from a Search Engine's Index

Video Link: https://www.youtube.com/watch?v=b7PH78fnsmw

Duration: 1:01:32

Google Tech Talks
August 17, 2006

Ziv Bar-Yossef joined Google from the Technion - Israel Institute of Technology in Haifa, Israel. He received his PhD from UC Berkeley in 2002 and was a Research Staff Member at the IBM Almaden Research Center before joining the Technion.

This was an academic study conducted before Ziv Bar-Yossef joined Google and does not represent Google's views.

ABSTRACT
We revisit a problem introduced by Bharat and Broder almost a decade ago: how to sample random pages from a search engine's index using only the search engine's public interface?

In this paper we introduce two novel sampling techniques: a lexicon-based technique and a random walk technique. Our methods...

Tags:
google
howto
random
sampling
search
engine