Classic Network Measurements meets Software Defined Networking

Ran Ben Basat, Technion, Israel
Joint work with Gil Einziger and Erez Waisbard (Nokia Bell Labs), Roy Friedman (Technion), and Marcello Luzieli (UFRGS)

Network Measurements
Example packet stream: 220.6.67.9, 181.7.20.6, 188.13.7.2, 181.7.20.6, 220.6.67.9, 188.13.7.2, 181.7.20.6

Network Measurements
Applications: Elephant Flows Detection, Averaging over Sliding Windows, Load Balancing, Traffic Engineering, Link Utilization, Caching, Trend Detection, Counting Distinct Elements, DDoS Identification, Worm Propagation, Link-based SEO, Estimating the fraction of rare flows, Customer Satisfaction, DDoS Detection, Computing Quantiles, Data Log Analysis, Network Health Monitoring

Frequency Estimation
How many packets has a given flow sent?
Classic setting: Can't allocate a counter for each flow!
In software: Memory is cheap (to an extent). Speed is the new bottleneck.

Agenda
1. Classical Frequency Estimation Algorithms
2. Accelerating space-oriented algorithms
3. Hierarchical Heavy Hitters

Count Min Sketch (Cormode and Muthukrishnan)
A matrix of counters with log(1/δ) rows and 2/ε counters per row. Each arriving packet increments one counter in every row, and a query returns the minimum of the item's counters.
Per row: E[Noise] ≤ Nε/2, so Pr[Noise > Nε] ≤ 1/2.
Overall: Pr[Noise > Nε] ≤ δ.
Example: Query(x) = 2.
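
The dimensions on this slide can be turned into a small working sketch. The following is a minimal illustration, not the speakers' implementation: the class name, the `eps`/`delta` parameters, and the use of salted BLAKE2b as a stand-in for a pairwise-independent hash family are all assumptions made for the example.

```python
import hashlib
import math


class CountMinSketch:
    """Count-Min Sketch: ceil(log2(1/delta)) rows of ceil(2/eps) counters."""

    def __init__(self, eps, delta):
        self.width = math.ceil(2 / eps)
        self.depth = max(1, math.ceil(math.log2(1 / delta)))
        self.rows = [[0] * self.width for _ in range(self.depth)]

    def _index(self, row, key):
        # Assumption: a salted BLAKE2b digest stands in for one hash
        # function per row; real deployments use cheaper hash families.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, weight=1):
        # One counter per row is incremented.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += weight

    def query(self, key):
        # Every row can only overestimate, so the minimum is the best guess.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


cms = CountMinSketch(eps=0.1, delta=0.01)
for ip in ["220.6.67.9", "181.7.20.6", "188.13.7.2", "181.7.20.6"]:
    cms.add(ip)
estimate = cms.query("181.7.20.6")  # never below the true count of 2
```

Because collisions only add to a counter, the estimate is one-sided: it can exceed the true frequency but never fall below it.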

Space Saving (Metwally et al.)
For an N-sized stream, when allocated m counters, Space Saving guarantees: f(x) ≤ Query(x) ≤ f(x) + N/m.
To achieve an Nε approximation, 1/ε counters are needed.
Example (ID/Value counter table): one query returns 3, another returns 2.
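
A compact sketch of the Space Saving update rule may help make the guarantee concrete. The class and method names are assumptions for illustration, and the linear scan for the minimal counter is chosen for brevity rather than speed.

```python
class SpaceSaving:
    """Space Saving with m counters: f(x) <= query(x) <= f(x) + N/m."""

    def __init__(self, m):
        self.m = m
        self.counters = {}  # monitored item -> counter value

    def add(self, key, weight=1):
        if key in self.counters:
            self.counters[key] += weight
        elif len(self.counters) < self.m:
            self.counters[key] = weight
        else:
            # Table is full: evict the item with the minimal counter and
            # let the newcomer inherit that value (linear scan for clarity).
            victim = min(self.counters, key=self.counters.get)
            self.counters[key] = self.counters.pop(victim) + weight

    def query(self, key):
        if key in self.counters:
            return self.counters[key]
        # An unmonitored item is bounded by the minimal counter once the
        # table is full; before that, it has never been seen.
        if len(self.counters) < self.m:
            return 0
        return min(self.counters.values())
```

With m = 1/ε counters the additive error N/m becomes Nε, which is the approximation stated on the slide.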

Accelerating Measurement Algorithms
Add(x, w):
  V ← V + w
  With probability τ(w/V): add x with weight w · τ⁻¹(w/V) to the measurement algorithm
  Otherwise: ignore the packet
Start with an N-sized stream and an (ε, δ) measurement algorithm.
We can iteratively accelerate ourselves!
Accelerated algorithm: get an O(log N · ε⁻² · log δ⁻¹)-sized stream and a (2ε, 3δ) measurement algorithm.
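
The sampling front end described on this slide can be sketched as a thin wrapper around any measurement algorithm that accepts weighted updates. This is only an illustration under stated assumptions: the slide does not define τ, so `tau` is a caller-supplied, hypothetical function returning a probability in [0, 1], and τ⁻¹(w/V) is read here as the reciprocal 1/τ(w/V). The names `SampledFrontEnd` and `backend_add` are likewise invented for the example.

```python
import random


class SampledFrontEnd:
    """Forward a packet with probability tau(w / V), reweighted by 1 / tau(w / V)."""

    def __init__(self, backend_add, tau, rng=None):
        self.backend_add = backend_add  # add(key, weight) of the wrapped algorithm
        self.tau = tau                  # hypothetical sampling-probability function
        self.rng = rng or random.Random()
        self.volume = 0                 # V: total weight seen so far

    def add(self, key, weight=1):
        self.volume += weight           # V <- V + w
        p = self.tau(weight / self.volume)
        if self.rng.random() < p:
            # Assumption: inverse-probability weighting, so the forwarded
            # weight equals the true weight in expectation.
            self.backend_add(key, weight / p)
        # Otherwise the packet is ignored and the backend does no work.


def example_tau(ratio, c=50.0):
    # Hypothetical choice: sample everything early, then thin out as V grows.
    return min(1.0, c * ratio)
```

With unit weights and the hypothetical `example_tau`, the forwarding probability decays as c/V, so the wrapped algorithm sees a number of packets that grows only logarithmically with the stream length.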

Empirical Evaluation

Distributed SDN Implementation

Hierarchical Heavy Hitters
181.7.20.1, 181.7.20.2, 181.7.21.1, 181.7.21.2
Can we block only the attacking devices?

Multi-Dimensional Hierarchies
181.7.*.*, 181.7.20.*, 181.7.20.13, 220.7.16.*, 220.7.16.9

Hierarchical Heavy Hitters using Space Saving (Mitzenmacher et al.)
For each packet (e.g., 181.7.20.6), compute all prefixes: 181.7.20.6, 181.7.20.*, 181.7.*.*, 181.*.*.*, *.*.*.*
Each prefix updates the Space Saving instance of its level (Level0 to Level4).
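
The per-packet work of this approach can be sketched as follows. This is a simplified illustration with several assumptions: `collections.Counter` is used as an exact stand-in for the per-level Space Saving instances, the hierarchy is byte-granular IPv4, and level k is taken to mean "k wildcard bytes" (the slide's level numbering is not recoverable from the transcript).

```python
from collections import Counter

LEVELS = 5  # assumption: Level0..Level4, one per number of wildcard bytes


def prefixes(ip):
    """Return all prefixes of a dotted IPv4 address, most specific first."""
    octets = ip.split(".")
    return [".".join(octets[:4 - k] + ["*"] * k) for k in range(LEVELS)]


class AllPrefixesHHH:
    def __init__(self):
        # One frequency summary per level (Counter stands in for Space Saving).
        self.levels = [Counter() for _ in range(LEVELS)]

    def add(self, ip, weight=1):
        # Every packet updates every level, so the cost grows with the
        # number of levels in the hierarchy.
        for level, prefix in enumerate(prefixes(ip)):
            self.levels[level][prefix] += weight

    def query(self, prefix):
        level = prefix.split(".").count("*")
        return self.levels[level][prefix]


hhh = AllPrefixesHHH()
for ip in ["181.7.20.1", "181.7.20.2", "181.7.21.1", "181.7.21.2"]:
    hhh.add(ip)
```

In this example no single address is heavy, yet the aggregate 181.7.*.* accounts for all four packets, which is the situation hierarchical heavy hitters are meant to expose.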

Constant Time Hierarchical Heavy Hitters (SIGCOMM 2017)
For each packet, compute a single random prefix (e.g., 181.7.20.*) and update only that level (one of Level0 to Level4).

Constant Time Hierarchical Heavy Hitters (SIGCOMM 2017)
Example packet stream: 181.7.20.3, 181.7.20.6, 188.3.12.3, 188.67.7.1, 92.67.7.81, 181.7.20.2
With probability 9/10: ignore the packet.
Otherwise: compute a random prefix (e.g., 181.7.20.*) and update only that level (one of Level0 to Level4).
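
The randomized update on this slide can be sketched in a few lines. This is an illustration under explicit assumptions, not the published algorithm's code: the 9/10 is taken to be the probability of ignoring a packet, `collections.Counter` stands in for the per-level Space Saving instances, level k means "k wildcard bytes", and estimates are scaled by the inverse of the per-level update probability. All names are invented for the example.

```python
import random
from collections import Counter

LEVELS = 5  # assumption: Level0..Level4 for byte-granular IPv4 prefixes


class RandomPrefixHHH:
    def __init__(self, ignore_prob=0.9, rng=None):
        if not 0.0 <= ignore_prob < 1.0:
            raise ValueError("ignore_prob must be in [0, 1)")
        self.levels = [Counter() for _ in range(LEVELS)]
        self.ignore_prob = ignore_prob
        self.rng = rng or random.Random()
        # A given level is updated with probability (1 - ignore_prob) / LEVELS,
        # so raw counts are scaled by the inverse to estimate frequencies.
        self.scale = LEVELS / (1.0 - ignore_prob)

    def add(self, ip):
        if self.rng.random() < self.ignore_prob:
            return  # ignore packet: no level is touched
        k = self.rng.randrange(LEVELS)  # one random level per packet
        octets = ip.split(".")
        prefix = ".".join(octets[:4 - k] + ["*"] * k)
        self.levels[k][prefix] += 1

    def estimate(self, prefix):
        k = prefix.split(".").count("*")
        return self.levels[k][prefix] * self.scale
```

Each packet now triggers at most one counter update regardless of the hierarchy size, at the price of estimates that are only accurate once enough packets have been processed.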

Constant Time Hierarchical Heavy Hitters (SIGCOMM 2017)
How do we detect the minimal prefix networks?
How long does it take to converge?

Any Questions?