Get Your Data Together! Algorithms for Managing Data Lakes
Video Link: https://www.youtube.com/watch?v=388w8lkP930
Data lakes (e.g., enterprise data catalogs and Open Data portals) are data dumps if users cannot find and use the data in them. In this talk, I present two problems in massive, dynamic data lakes: 1) searching for joinable tables to discover potential linkages, and 2) joining tables from different sources through automatically generated syntactic transformations of the join values. I will also present algorithmic solutions that scale to data lakes that are large in both the number of tables (millions) and the size of each table. The presented work has been published in SIGMOD and VLDB.
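To make the first problem concrete, here is a minimal sketch of what "searching for joinable tables" can mean: given the values of a query column, rank the columns in the lake by how many distinct values they share with it. This is only an illustration under stated assumptions, not the algorithm presented in the talk. The function names, the toy `lake` data, and the plain inverted index are all hypothetical, and a brute-force index like this one would not by itself scale to millions of tables.

```python
from collections import defaultdict


def build_inverted_index(tables):
    """Map each distinct cell value to the (table, column) pairs that contain it."""
    index = defaultdict(set)
    for table_name, columns in tables.items():
        for column_name, values in columns.items():
            for value in set(values):
                index[value].add((table_name, column_name))
    return index


def top_k_joinable(query_values, index, k=3):
    """Rank columns by the number of distinct query values they contain."""
    overlap = defaultdict(int)
    for value in set(query_values):
        # .get avoids creating empty entries for values absent from the lake
        for column_id in index.get(value, ()):
            overlap[column_id] += 1
    # Highest overlap first; ties broken by (table, column) name
    ranked = sorted(overlap.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical toy data lake: table name -> column name -> values
lake = {
    "city_population": {
        "city": ["Toronto", "Seattle", "Boston"],
        "population": ["2930000", "737000", "675000"],
    },
    "airports": {"code": ["YYZ", "SEA"], "city": ["Toronto", "Seattle"]},
    "parks": {"name": ["High Park"], "city": ["Toronto"]},
}

index = build_inverted_index(lake)
results = top_k_joinable(["Toronto", "Seattle", "Vancouver"], index)
for (table, column), shared in results:
    print(f"{table}.{column}: {shared} shared values")
```

Ranking by raw overlap favors columns that can actually be joined with the query column, regardless of how large the candidate column is. The second problem in the abstract arises when values that should match are formatted differently across sources, so a plain equality lookup like the one above would miss them.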
See more at https://www.microsoft.com/en-us/research/video/get-your-data-to…aging-data-lakes/