Get Your Data Together! Algorithms for Managing Data Lakes

Video Link: https://www.youtube.com/watch?v=388w8lkP930



Duration: 58:31


Data lakes (e.g., enterprise data catalogs and Open Data portals) are little more than data dumps if users cannot find and use the data in them. In this talk, I present two problems in massive, dynamic data lakes: 1) searching for joinable tables to discover potential linkages, and 2) joining tables from different sources through auto-generated syntactic transformations of the join values. I will also present algorithmic solutions that scale to data lakes that are large in both the number of tables (millions) and the size of each table. The presented work has been published in SIGMOD and VLDB.
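To make the first problem concrete, here is a minimal sketch of a joinable-table search. It assumes that "joinable" can be approximated by set containment, meaning the fraction of distinct values in a query column that also appear in a candidate column. That measure, the function names (`containment`, `top_k_joinable`), and the toy `lake` dictionary are illustrative assumptions, not the algorithms presented in the talk.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A column is identified by (table name, column name). Illustrative convention.
ColumnId = Tuple[str, str]


def containment(query: Set[str], candidate: Set[str]) -> float:
    """Fraction of distinct query values that also appear in the candidate."""
    if not query:
        return 0.0
    return len(query & candidate) / len(query)


def top_k_joinable(
    query_values: Iterable[str],
    lake: Dict[ColumnId, List[str]],
    k: int = 3,
) -> List[Tuple[ColumnId, float]]:
    """Rank every column in the lake by containment of the query column.

    Brute force: scans all columns, so it only illustrates the problem.
    """
    query = set(query_values)
    scored = [(col, containment(query, set(values))) for col, values in lake.items()]
    # Drop columns that share no values with the query.
    scored = [(col, score) for col, score in scored if score > 0]
    # Highest containment first; ties broken by column id for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical toy data lake with three single-column tables.
lake = {
    ("parks", "city"): ["Toronto", "Ottawa", "Montreal", "Calgary"],
    ("budgets", "municipality"): ["Toronto", "Ottawa", "Halifax"],
    ("weather", "station"): ["YYZ", "YOW"],
}

results = top_k_joinable(["Toronto", "Ottawa", "Montreal"], lake, k=2)
for (table, column), score in results:
    print(f"{table}.{column}: containment={score:.2f}")
```

This scan touches every column for every query, which is exactly what becomes infeasible at the scale the abstract describes (millions of tables, large tables). It is meant only to show what a joinable-table query asks for, not how to answer it efficiently.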

See more at https://www.microsoft.com/en-us/research/video/get-your-data-to…aging-data-lakes/




Other Videos By Microsoft Research


2019-04-22 From Barriers to Bridges: Designing Infrastructures for Help in Online Programming Communities
2019-04-19 Formal Design, Implementation and Verification of Blockchain Languages
2019-04-19 Formalizing Teamwork in Human-Robot Interaction
2019-04-17 Sensing Posture-Aware Pen+Touch Interaction on Tablets
2019-04-17 AI for Earth with Dr. Lucas Joppa
2019-04-15 Broad-Based Side-Channel Defenses for Modern Processor Architectures
2019-04-12 Security for All: Modeling Structural Inequities to Design More Secure Systems
2019-04-12 Learning in Data Scarce Visual and Multimodal Applications Using Vectorized
2019-04-10 Holograms, spatial anchors and the future of computer vision with Dr. Marc Pollefeys
2019-04-08 Better Apps: Delivering Universal UI Patterns as Web Components
2019-04-08 Get Your Data Together! Algorithms for Managing Data Lakes
2019-04-04 Rapidly Enabling Autonomy in Warehouses with High-Fidelity Simulations
2019-04-04 High-Fidelity Simulations to enable Automated Wind Turbine Inspections
2019-04-03 Enabling design with Ann Paradiso
2019-03-26 Digital Sky
2019-03-26 Robotic Lunar Exploration Missions
2019-03-26 Gamifier Policies : A Tool for Creating a Holistic Healthcare Ecosystem
2019-03-26 Panel Discussion: AI for Societal Impact
2019-03-26 Atmos Realtime PM2.5 Air Quality for Citizen Science Monitoring
2019-03-26 Storyweaver: Leveraging technology, collaboration and open content to create reading resources
2019-03-26 KauwaKaate: A Platform for Fake-news Verification



Tags:
microsoft research