NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Spark: In-Memory Cluster...

Channel:

Google TechTalks

Subscribers:

348,000

Published on February 14, 2012 3:37:57 AM ● Video Link: https://www.youtube.com/watch?v=qLvLg-sqxKc

Duration: 40:52

12,419 views

Big Learning Workshop: Algorithms, Systems, and Tools for Learning at Scale at NIPS 2011
Invited Talk: Spark: In-Memory Cluster Computing for Iterative and Interactive Applications by Matei Zaharia

Matei Zaharia is a fifth-year graduate student at UC Berkeley, working with Scott Shenker and Ion Stoica on topics in cloud computing, operating systems, and networking. He is also a committer on Apache Hadoop and is funded by a Google PhD fellowship. Before joining Berkeley, Matei earned his undergraduate degree at the University of Waterloo in Canada.

Abstract: MapReduce and its variants have been highly successful in supporting large-scale data-intensive cluster applications. However, these systems are inefficient for applications that share data among multiple computation stages, including many machine learning algorithms, because they are based on an acyclic data flow model. We present Spark, a new cluster computing framework that extends the data flow model with a set of in-memory storage abstractions to efficiently support these applications. Spark outperforms Hadoop by up to 30x in iterative machine learning algorithms while retaining MapReduce's scalability and fault tolerance. In addition, Spark makes programming jobs easy by integrating into the Scala programming language. Finally, Spark's ability to load a dataset into memory and query it repeatedly makes it especially suitable for interactive analysis of big data. We have modified the Scala interpreter to make it possible to use Spark interactively as a highly responsive data analytics tool.
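The data-reuse argument in the abstract can be illustrated with a toy, single-machine sketch. It is plain Python, not Spark's API: `CountingSource`, `train_reloading`, and `train_cached` are hypothetical names invented here, and the logistic-regression loop merely stands in for "an iterative machine learning algorithm". The only point is that an iterative job which re-reads its input on every pass scans storage once per iteration, while one that keeps the dataset in memory scans it once.

```python
import math
import random


class CountingSource:
    """Hypothetical stand-in for slow, disk-backed storage; counts full scans."""

    def __init__(self, points):
        self._points = points
        self.scans = 0

    def read_all(self):
        self.scans += 1
        return list(self._points)


def gradient(w, points):
    """Gradient of the logistic loss for 1-D features and labels in {-1, +1}."""
    g = 0.0
    for x, y in points:
        g += (1.0 / (1.0 + math.exp(-y * w * x)) - 1.0) * y * x
    return g


def train_reloading(source, iterations, lr=0.01):
    """Acyclic-data-flow style: every iteration re-reads its input from storage."""
    w = 0.0
    for _ in range(iterations):
        w -= lr * gradient(w, source.read_all())
    return w


def train_cached(source, iterations, lr=0.01):
    """In-memory style: read the data once, then reuse it in every iteration."""
    cached = source.read_all()
    w = 0.0
    for _ in range(iterations):
        w -= lr * gradient(w, cached)
    return w


# Synthetic data (an assumption for illustration): the label is the sign of x.
random.seed(0)
data = [(x, 1 if x > 0 else -1)
        for x in (random.uniform(-1, 1) for _ in range(200))]

reloading_source = CountingSource(data)
cached_source = CountingSource(data)
w_reload = train_reloading(reloading_source, iterations=10)
w_cached = train_cached(cached_source, iterations=10)

# Number of full scans of storage for each strategy.
print(reloading_source.scans, cached_source.scans)
```

Both loops compute the same weight; they differ only in how often they touch storage (10 scans versus 1 here). On a real cluster the repeated scans mean disk and network I/O, which is the cost the abstract says in-memory storage avoids.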

At Berkeley, we have used Spark to implement several large-scale machine learning applications, including a Twitter spam classifier and a real-time automobile traffic estimation system based on expectation maximization. We will present lessons learned from these applications and optimizations we added to Spark as a result.

Other Videos By Google TechTalks

2012-02-23	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Vowpal Wabbit Tutorial
2012-02-23	NIPS 2011 Sparse Representation & Low-rank Approximation Workshop: Group Sparse Hidden Markov...
2012-02-23	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: A Common GPU...
2012-02-23	The Relative Happiness Index (RHI)
2012-02-23	A Chinese Typewriter in Silicon Valley
2012-02-23	3D Computer Vision: Past, Present, and Future
2012-02-20	Knowledge is... Love
2012-02-16	Meditate with Father Laurence Freeman
2012-02-14	Agile C++ with Supporting Eclipse CDT Plug-ins
2012-02-14	Santa Tracker - 1.6 Million Requests per Second
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Spark: In-Memory Cluster...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Real time data...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Hazy - Making Data-driven...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Block splitting for...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: No-U-Turn Sampler...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Graphlab 2...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Graphlab 2 Tutorial
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Large-Scale Matrix...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Randomized Smoothing for...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Machine Learning's Role...
2012-02-13	NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Fast Cross-Validation...

Tags:

new

bigml

zaharia
