Data Harvesting: A Random Coding Approach to Rapid Dissemination and Efficient Storage of Data

Video Link: https://www.youtube.com/watch?v=QGmPy-BZqBw



Duration: 1:03:02


In this talk, we will see how Random Linear Coding (RLC) based protocols can provide large gains in two related problems in large distributed systems: disseminating information rapidly in a decentralized manner, and efficiently storing a large file in a distributed manner.

We will start with information dissemination, which is the primary focus of the talk. The general setting is as follows. There are N nodes in the network and K distinct messages spread across the system initially, but not all nodes have all the messages. The question we ask is: how quickly can all K messages be disseminated to all the nodes? For fully connected graphs with point-to-point, gossip-based communication, we will show that the time to disseminate the messages with an RLC-based protocol is order-optimal in the regime K=O(N). Simulation results for different network topologies, under both point-to-point and point-to-multipoint communication, will demonstrate the large gains available from using RLC-based protocols to disseminate messages simultaneously.

We will then touch on an RLC-based strategy for storing a large file in a distributed manner. In the framework we consider, there are many storage locations, each with very limited storage space. Each storage location chooses a part of the file (or a coded combination of the parts) without knowing what is stored in the other locations. We will show that, with RLC-based storage, the minimum number of storage locations a downloader needs to connect to in order to reconstruct the entire file can be very close to the number needed when there is complete coordination between the storage locations and the downloader.
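The two settings described above can be illustrated with a small simulation. The following is a minimal sketch, not the protocol or analysis from the talk. It assumes coding over GF(2) (coefficient vectors stored as Python integers used as bitmasks), synchronous rounds in which every node pushes one coded packet to a uniformly random other node, and messages placed initially at random nodes. For storage, it assumes each contacted location independently holds either one uniformly random coded combination of the K file parts or one uniformly random uncoded part. All function names (`insert_vector`, `random_combination`, `rlc_gossip_rounds`, `locations_needed`) are hypothetical and chosen for illustration.

```python
import random


def insert_vector(basis, v):
    """Reduce v against a GF(2) basis (dict: leading-bit index -> vector).

    Adds the reduced vector to the basis if it is linearly independent.
    Returns True if the rank grew, False otherwise.
    """
    while v:
        lead = v.bit_length() - 1
        if lead not in basis:
            basis[lead] = v
            return True
        v ^= basis[lead]  # cancel the leading bit and keep reducing
    return False


def random_combination(basis, rng):
    """Return the XOR of a uniformly random subset of the basis vectors."""
    out = 0
    for vec in basis.values():
        if rng.random() < 0.5:
            out ^= vec
    return out


def rlc_gossip_rounds(n, k, seed=0, max_rounds=10_000):
    """Count push-gossip rounds until all n nodes can decode all k messages.

    Assumption: n >= 2, and each message starts at one random node.
    """
    rng = random.Random(seed)
    bases = [dict() for _ in range(n)]
    for m in range(k):
        insert_vector(bases[rng.randrange(n)], 1 << m)

    rounds = 0
    while any(len(b) < k for b in bases):
        if rounds >= max_rounds:
            raise RuntimeError("dissemination did not finish")
        # Build all packets from the state at the start of the round.
        packets = []
        for sender in range(n):
            if not bases[sender]:
                continue  # nothing to send yet
            target = rng.randrange(n - 1)
            if target >= sender:
                target += 1  # uniform over the other n - 1 nodes
            packets.append((target, random_combination(bases[sender], rng)))
        for target, vec in packets:
            insert_vector(bases[target], vec)
        rounds += 1
    return rounds


def locations_needed(k, coded, seed=0):
    """Count storage locations contacted until the k file parts are decodable."""
    rng = random.Random(seed)
    basis = {}
    contacted = 0
    while len(basis) < k:
        if coded:
            stored = rng.getrandbits(k)      # random combination of the parts
        else:
            stored = 1 << rng.randrange(k)   # one uncoded part
        insert_vector(basis, stored)
        contacted += 1
    return contacted


if __name__ == "__main__":
    print("gossip rounds:", rlc_gossip_rounds(n=32, k=32, seed=1))
    print("coded locations:", locations_needed(64, coded=True, seed=1))
    print("uncoded locations:", locations_needed(64, coded=False, seed=1))
```

A node (or downloader) can decode once its basis has rank K, so both functions simply track rank. With full coordination a downloader would need exactly K locations, which gives a baseline to compare the coded and uncoded counts against over several seeds.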







Tags:
microsoft research