Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution

Subscribers:
344,000
Published on ● Video Link: https://www.youtube.com/watch?v=ZGELY7D0e9A



Duration: 59:26
405 views
1


Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a ``lossy data structure'' which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2\). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow-distribution.




Other Videos By Microsoft Research


2016-09-05TQFTs and tight contact structures on 3-manifolds      
2016-09-05Wireless Embedded Networks/The Ecosystem and Cool Challenges
2016-09-05Data Mining & Machine Learning to empower business strategy
2016-09-05Some uses of orthogonal polynomials
2016-09-05Approximation Algorithms for Embedding with Extra Information and Ordinal Relaxation
2016-09-05Evaluating Retrieval System Effectiveness
2016-09-05Exploiting the Transients of Adaptation for RoQ Attacks on Internet Resources
2016-09-05Specification-Based Annotation Inference
2016-09-05Emotion Recognition in Speech Signal: Experimental Study, Development and Applications
2016-09-05Text summarization: News and Beyond
2016-09-05Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution
2016-09-05Raising the Bar: Integrity and Passion in Life and Business: The Story of Clif Bar, Inc.
2016-09-05Revelationary Computing, Proactive Displays and The Experience UbiComp Project
2016-09-05The Design of A Formal Property-Specification Language
2016-09-05Data Harvesting: A Random Coding Approach to Rapid Dissemination and Efficient Storage of Data
2016-09-05Runtime Refinement Checking for Concurrent Data Structures
2016-09-05Lost in Space: The Fall of NASA and the Dream of a New Space Age
2016-09-05Solving Geometric Matching Problems using Interval Arithmetic Optimization
2016-09-05How to Disembed a Program
2016-09-05Laboratory for Recognition and Organization of Speech
2016-09-05The (Mis)Behavior of Markets: A Fractal View of Risk, Ruin and Return



Tags:
microsoft research