DIFF: A Relational Interface for Large-Scale Data Explanation

Channel: Microsoft Research

Subscribers: 351,000

Published on October 1, 2019 12:40:37 AM ● Video Link: https://www.youtube.com/watch?v=dWEvtuxqbfk

Duration: 23:42

536 views

A range of explanation engines assist data analysts by performing feature selection over increasingly high-volume and high-dimensional data, grouping and highlighting commonalities among data points. While useful in diverse tasks such as user behavior analytics, operational event processing, and root cause analysis, today’s explanation engines are designed as standalone data processing tools that do not interoperate with traditional, SQL-based analytics workflows; this limits the applicability and extensibility of these engines. In response, we propose the DIFF operator, a relational aggregation operator that unifies the core functionality of these engines with declarative relational query processing. We implement both single-node and distributed versions of the DIFF operator in MB SQL, an extension of MacroBase, and demonstrate how DIFF can provide the same semantics as existing explanation engines while capturing a broad set of production use cases in industry, including at Microsoft and Facebook. Additionally, we illustrate how this declarative approach to data explanation enables new logical and physical query optimizations. We evaluate these optimizations on several real-world production applications, and find that DIFF in MB SQL can outperform state-of-the-art engines by up to an order of magnitude.
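To make the idea of an explanation operator over two relations more concrete, here is a minimal Python sketch of the kind of computation the abstract describes: comparing a "test" relation against a "control" relation and reporting attribute-value combinations that are disproportionately common in the test rows. Everything in it is an assumption for illustration: the function name `diff_explain`, the choice of support and risk ratio as metrics, the thresholds, and the sample data are hypothetical, and the code is not MB SQL syntax or the implementation presented in the talk.

```python
from collections import Counter
from itertools import combinations


def diff_explain(test, control, attrs, min_support=0.5,
                 min_risk_ratio=3.0, max_order=2):
    """Hypothetical DIFF-style explanation (illustrative assumption only).

    test, control: lists of dicts (rows of two relations).
    attrs: attribute names to explain over.
    Returns combinations of up to max_order attribute values whose
    support in `test` and risk ratio both meet the thresholds.
    """
    def count_combos(rows):
        # Count every combination of 1..max_order attribute values per row.
        counts = Counter()
        for row in rows:
            items = [(a, row[a]) for a in attrs]
            for k in range(1, max_order + 1):
                for combo in combinations(items, k):
                    counts[combo] += 1
        return counts

    test_counts = count_combos(test)
    control_counts = count_combos(control)
    n_test, n_control = len(test), len(control)

    results = []
    for combo, a in test_counts.items():
        b = control_counts.get(combo, 0)
        support = a / n_test  # fraction of test rows matching the combo
        if support < min_support:
            continue
        # Risk ratio: P(test | combo) / P(test | not combo).
        rest_test = n_test - a
        rest_total = rest_test + (n_control - b)
        if rest_test == 0:
            risk_ratio = float("inf")
        else:
            risk_ratio = (a / (a + b)) / (rest_test / rest_total)
        if risk_ratio >= min_risk_ratio:
            results.append({"attributes": dict(combo),
                            "support": support,
                            "risk_ratio": risk_ratio})
    # Strongest explanations first.
    return sorted(results, key=lambda r: r["risk_ratio"], reverse=True)


# Made-up sample data: crashing sessions (test) vs. healthy ones (control).
test_rows = [
    {"device": "A", "version": "v1"},
    {"device": "A", "version": "v2"},
    {"device": "A", "version": "v1"},
    {"device": "B", "version": "v1"},
]
control_rows = [
    {"device": "B", "version": "v1"},
    {"device": "B", "version": "v2"},
    {"device": "A", "version": "v2"},
    {"device": "B", "version": "v1"},
    {"device": "B", "version": "v2"},
    {"device": "B", "version": "v1"},
]

results = diff_explain(test_rows, control_rows, ["device", "version"])
for r in results:
    print(r)
```

This brute-force sketch enumerates every combination and does no pruning of redundant explanations, so it only conveys the semantics of comparing two relations; the logical and physical optimizations mentioned in the abstract are outside its scope.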

This is joint work with Peter Kraft, Sahaana Suri, Edward Gan, Eric Xu, Atul Shenoy†, Asvin Ananthanarayan†, John Sheu†, Erik Meijer‡, Xi Wu§, Jeff Naughton§, Peter Bailis, and Matei Zaharia at Stanford, Microsoft (†), Facebook (‡), and Google (§).

Talk slides: https://www.microsoft.com/en-us/research/uploads/prod/2019/09/DIFF-A-Relational-Interface-for-Large-Scale-Data-Explanation-SLIDES.pdf

Learn more about this and other talks at Microsoft Research: https://www.microsoft.com/en-us/research/video/diff-a-relational-interface-for-large-scale-data-explanation/

Other Videos By Microsoft Research

2019-10-07	Tea: A High-level Language and Runtime System for Automating Statistical Analysis [Python module]
2019-10-07	Discover[i]: Component-based Parameterized Reasoning for Distributed Applications
2019-10-04	Scheduling For Efficient Large-Scale Machine Learning Training
2019-10-03	Distributed Entity Resolution for Computational Social Science
2019-10-03	MMLSpark: empowering AI for Good with Mark Hamilton [Podcast]
2019-10-02	Non-linear Invariants for Control-Command Systems
2019-10-02	Vision-and-Dialog Navigation
2019-10-01	The Future of Mathematics?
2019-09-30	How Not to Prove Your Election Outcome
2019-09-30	The Worst Form Including All Those Others: Canada’s Experiments with Online Voting
2019-09-30	DIFF: A Relational Interface for Large-Scale Data Explanation
2019-09-30	A Calculus for Brain Computation
2019-09-26	Decoding Multisensory Attention from Electroencephalography for Use in a Brain-Computer Interface
2019-09-26	A Short Introduction to DIMACS & DIMACS and MSR-NYC
2019-09-26	Boosting Innovation and Discovery of Ideas
2019-09-26	Resource-Efficient Redundancy for Large-Scale Data Processing and Storage Systems
2019-09-26	Optimizing Declarative Graph Queries at Large Scale
2019-09-25	SILK: Preventing Latency Spikes in Log-Structured Merge Key-Value Stores
2019-09-25	Coverage Guided, Property Based Testing
2019-09-25	Efficient Robot Skill Learning: Grounded Simulation Learning and Imitation Learning from Observation
2019-09-25	Towards Secure and Interpretable AI: Scalable Methods, Interactive Visualizations, & Practical Tools

Tags:

explanation engines

data analysis

high-dimensional data

high-volume data

DIFF operator

relational aggregation operator

Large-Scale Data Explanation

data explanation

DIFF

Microsoft Research

Firas Abuzaid

MSR
