Disk-Based Parallel Computation, Rubik's Cube, and Checkpointing

Channel: Google TechTalks

Subscribers: 349,000

Published: March 25, 2008 10:06:28 AM

Video Link: https://www.youtube.com/watch?v=WQw7c-PliB4

Duration: 1:13:53

Views: 12,623

Google Tech Talks
March 24, 2008

ABSTRACT

This talk takes us on a journey through three varied, but interconnected
topics. First, our research lab has engaged in a series of disk-based
computations extending over five years. Disks have traditionally
been used for filesystems, for virtual memory, and for databases.
Disk-based computation opens up an important fourth use: an abstraction
for multiple disks that allows parallel programs to treat them in a
manner similar to RAM. The key observation is that 50 disks have
approximately the same parallel bandwidth as a _single_ RAM subsystem.
This leaves latency as the primary concern. A second key is the use
of techniques like delayed duplicate detection to avoid latency. For
example, hash accesses can be saved (even saved on disk) until
there are sufficiently many pending accesses to use standard streaming
techniques. We have designed a library for search problems that exploits
the high parallel bandwidth while hiding the latency. We build
abstractions for search that employ parallel disk-based hash arrays
with the same speed as a single hash array in a single RAM subsystem.
In the case of Rubik's cube, we exploited this mechanism by using
seven terabytes of distributed disk in a search problem that showed
that 26 moves suffice to solve Rubik's cube. Our initial efforts
emphasize idempotent operations, so that we can easily recover from
hardware or software faults. We next intend to apply a more general
solution for fault recovery: checkpointing. This separate effort
in our lab has now produced a mature, robust user-level checkpointing
package. The package works successfully in tests
on OpenMPI, MPICH-2, OpenMP, and parallel IPython (used in SciPy and
NumPy). Our DMTCP package transparently checkpoints parallel,
multi-threaded processes, with no modification either to the
operating system or to the application binaries. Extrapolating
from current experiments, we estimate that we can checkpoint a
1,000-node parallel computation in a matter of minutes. We are currently
searching for a testbed on which to demonstrate this scalability.
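
The delayed duplicate detection idea mentioned in the abstract can be illustrated with a small breadth-first search. This is only a minimal sketch, not the speaker's library: the function names `merge_new` and `bfs_level_sizes` are hypothetical, and in-memory sorted lists stand in for sorted files on disk. Instead of probing a hash table once per generated state (a random access), all candidate states for a level are collected, sorted, and checked against the visited set in one sequential pass.

```python
import heapq
from typing import Callable, Iterable, List


def merge_new(sorted_visited: List[int], sorted_candidates: List[int]) -> List[int]:
    """Return the distinct candidates that are absent from sorted_visited.

    Both inputs are sorted, so a single forward pass over each suffices.
    On disk this would be a streaming read of two sorted files.
    """
    new_states = []
    i = 0
    prev = None
    for c in sorted_candidates:
        if c == prev:  # duplicate within the pending batch itself
            continue
        prev = c
        # Advance the visited cursor; it never moves backward.
        while i < len(sorted_visited) and sorted_visited[i] < c:
            i += 1
        if i == len(sorted_visited) or sorted_visited[i] != c:
            new_states.append(c)
    return new_states


def bfs_level_sizes(start: int, neighbors: Callable[[int], Iterable[int]]) -> List[int]:
    """Breadth-first search that defers duplicate checks to once per level.

    Returns the number of newly discovered states at each depth.
    """
    visited = [start]   # sorted list standing in for a sorted file
    frontier = [start]
    sizes = [1]
    while True:
        # Generate every candidate first; no duplicate lookups happen here.
        candidates = sorted(n for s in frontier for n in neighbors(s))
        # Resolve all pending duplicate checks in one sequential pass.
        frontier = merge_new(visited, candidates)
        if not frontier:
            break
        # Merging two sorted sequences is also a streaming operation.
        visited = list(heapq.merge(visited, frontier))
        sizes.append(len(frontier))
    return sizes
```

As a toy usage example, a ring of 12 states where each state has neighbors `(s + 1) % 12` and `(s - 1) % 12` can be searched with `bfs_level_sizes(0, lambda s: [(s + 1) % 12, (s - 1) % 12])`. The design point is that sorting and merging touch data sequentially, which suits the high bandwidth but high latency of disks that the abstract describes.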

Speaker: Gene Cooperman

Tags: google, techtalks, techtalk, engedu, talk, talks, googletechtalks, education
