High-Throughput Data-Intensive Computing: Shared-Scan Scheduling in Scientific Databases & the Cloud

Subscribers: 344,000
Video Link: https://www.youtube.com/watch?v=NwwWnryd5Z8



Duration: 1:00:30
189 views


Data-intensive computing consists of batch-processing workloads that scan massive data sets in parallel. Because the emphasis is on data access, data movement, data ingest, and data production, these workloads overwhelm the network and I/O capabilities of data centers and supercomputers. Major throughput improvements are available by co-scheduling tasks that access the same data, so that multiple tasks complete their processing while the data is accessed and transferred only once. Multiple tasks share I/O, network data transfer, cache space, and even computation via SIMD or vector processing. This talk will review the evolution of co-scheduling in data-intensive computing systems, including shared-scan scheduling for map/reduce workloads (Agrawal et al., VLDB 2008), data-driven batch processing for scientific databases (LifeRaft and JAWS), shared streaming I/O for spatial workloads, and shared join processing for Pig programs and Nova workflows.
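
The core idea can be illustrated with a minimal sketch: group pending tasks by the data blocks they need, read each block once, and fan the data out to every task that wants it. The names Task, shared_scan, and read_block below are hypothetical and for illustration only; they are not the API of any of the systems mentioned above.

# A minimal sketch of shared-scan scheduling; Task, shared_scan, and
# read_block are hypothetical names, not any system's actual API.
from collections import defaultdict

class Task:
    """A batch task that must scan a declared set of data blocks."""
    def __init__(self, name, needed_blocks, fn):
        self.name = name
        self.needed_blocks = set(needed_blocks)  # block ids this task must read
        self.fn = fn                             # per-block processing callback

    def process(self, block_id, data):
        self.fn(block_id, data)

def shared_scan(tasks, read_block):
    """Read each requested block once and fan it out to every interested task,
    rather than letting each task issue its own redundant scan."""
    interested = defaultdict(list)               # block id -> tasks that need it
    for task in tasks:
        for block_id in task.needed_blocks:
            interested[block_id].append(task)

    for block_id, consumers in interested.items():
        data = read_block(block_id)              # I/O and transfer happen once
        for task in consumers:
            task.process(block_id, data)         # compute amortized over tasks

# Example: block 1 is needed by both tasks but is read from "storage" only once.
if __name__ == "__main__":
    storage = {0: b"alpha", 1: b"beta", 2: b"gamma"}
    counts = defaultdict(int)

    def count_bytes_into(key):
        def fn(block_id, data):
            counts[key] += len(data)
        return fn

    shared_scan(
        [Task("scan-A", [0, 1], count_bytes_into("scan-A")),
         Task("scan-B", [1, 2], count_bytes_into("scan-B"))],
        storage.__getitem__,
    )
    print(dict(counts))  # {'scan-A': 9, 'scan-B': 9}

In a real system the read_block call would stand in for disk or network I/O, which is exactly the resource the co-scheduling techniques in the talk aim to amortize across tasks.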


Tags:
microsoft research