Classic Network Measurements meets Software Defined Networking

Size: px
Start display at page:

Download "Classic Network Measurements meets Software Defined Networking"

Transcription

1 Classic Network s meets Software Defined Networking Ran Ben Basat, Technion, Israel Joint work with Gil Einziger and Erez Waisbard (Nokia Bell Labs) Roy Friedman (Technion) and Marcello Luzieli (UFGRS) 1

2 Network s

3 Network s Elephant Flows Detection Averaging over Sliding Load Balancing Windows Traffic Engineering Link Utilization Caching Trend Detection Counting Distinct Elements DDoS Identification Worm Propagation Link-based SEO Estimating the fraction of rare flows Customer Satisfaction DDoS Detection Computing Quantiles Data Log Analysis Network Health Monitoring 3

4 Frequency Estimation How many packets has sent? Classic Setting: Can t allocate a counter for each flow! In software: Memory is cheap (to an extent) Speed is the new bottleneck 4

5 Agenda Classical Frequency Estimation Algorithms. Accelerating space-oriented algorithms. Hierarchical Heavy Hitters. 5

6 Count Min Sketch (Cormode and Muthukrishnan) 2 ε counters log δ 1 Rows E Noise Nε/2 Pr Noise > Nε 1/2 Query( )= 2 Overall Pr Noise > Nε δ 6

7 Space Saving (Metwally et al.) For a N-sized stream, when allocated with m counters, Space Saving guarantees: Query( ) returns 3 Query ( ) returns 2 f x Query x f x + N m For achieving an Nε approximation, 1/ε counters are needed. ID Value

8 Agenda Classical Frequency Estimation Algorithms. Accelerating space-oriented algorithms. Hierarchical Heavy Hitters. 8

9 Accelerating s Algorithms Add x, w V V + w With probability τ(w/v) Otherwise Add x with weight w τ 1 (w/v) Ignore packet algorithm Start with a N sized stream and a (ε, δ) measurement algorithm. We can iteratively accelerate ourselves! Accelerated Algorithm Get an O(logN ε 2 logδ) sized stream and a (2ε, 3δ) measurement algorithm. 9

10 Empirical Evaluation 10

11 Empirical Evaluation 11

12 Distributed SDN Implementation 12

13 Agenda Classical Frequency Estimation Algorithms. Accelerating space-oriented algorithms. Hierarchical Heavy Hitters. 13

14 Hierarchical Heavy Hitters Can we block only the attacking devices? 14

15 Multi-Dimensional Hierarchies

16 Hierarchical Heavy Hitters using Space Saving (Mitzenmacher et al.) Level0 Level1 Compute all prefixes *.*.*.* 181.*.*.* *.* * Level2 Level3 Level4 16

17 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) Level0 Level1 Compute a random prefix * Level2 Level3 Level4 17

18 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) Level0 Level With probability 9/10 Compute a random prefix * Level2 Level3 Ignore packet Level4 18

19 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) How to detect the minimal prefix networks? How long it takes to converge? 19

20 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) 20

21 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) 21

22 Constant Time Hierarchical Heavy Hitters (SIGCOMM2017) 22

23 Any Questions 23

arxiv: v3 [cs.ds] 15 Oct 2017

arxiv: v3 [cs.ds] 15 Oct 2017 Fast Flow Volume Estimation Ran Ben Basat Technion sran@cs.technion.ac.il Gil Einziger Nokia Bell Labs gil.einziger@nokia.com Roy Friedman Technion roy@cs.technion.ac.il arxiv:70.0355v3 [cs.ds] 5 Oct 07

More information

Network-Wide Routing-Oblivious Heavy Hitters

Network-Wide Routing-Oblivious Heavy Hitters Network-Wide Routing-Oblivious Heavy Hitters Ran Ben Basat Gil Einziger Shir Landau Feibish Technion sran@cs.technion.ac.il Nokia Bell Labs gil.einziger@nokia.com Princeton University sfeibish@cs.princeton.edu

More information

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems Arnab Bhattacharyya, Palash Dey, and David P. Woodruff Indian Institute of Science, Bangalore {arnabb,palash}@csa.iisc.ernet.in

More information

Hierarchical Heavy Hitters with the Space Saving Algorithm

Hierarchical Heavy Hitters with the Space Saving Algorithm Hierarchical Heavy Hitters with the Space Saving Algorithm Michael Mitzenmacher Thomas Steinke Justin Thaler School of Engineering and Applied Sciences Harvard University, Cambridge, MA 02138 Email: {jthaler,

More information

12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters)

12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters) 12 Count-Min Sketch and Apriori Algorithm (and Bloom Filters) Many streaming algorithms use random hashing functions to compress data. They basically randomly map some data items on top of each other.

More information

arxiv: v3 [cs.ds] 24 Apr 2018

arxiv: v3 [cs.ds] 24 Apr 2018 Give Me Some Slack: Efficient Network Measurements Ran Ben Basat 1, Gil Einziger 2, and Roy Friedman 1 1 Department of Computer Science, Technion {sran,roy}@cs.technion.ac.il 2 Nokia Bell Labs gil.einziger@nokia.com

More information

11 Heavy Hitters Streaming Majority

11 Heavy Hitters Streaming Majority 11 Heavy Hitters A core mining problem is to find items that occur more than one would expect. These may be called outliers, anomalies, or other terms. Statistical models can be layered on top of or underneath

More information

B669 Sublinear Algorithms for Big Data

B669 Sublinear Algorithms for Big Data B669 Sublinear Algorithms for Big Data Qin Zhang 1-1 2-1 Part 1: Sublinear in Space The model and challenge The data stream model (Alon, Matias and Szegedy 1996) a n a 2 a 1 RAM CPU Why hard? Cannot store

More information

Approximate counting: count-min data structure. Problem definition

Approximate counting: count-min data structure. Problem definition Approximate counting: count-min data structure G. Cormode and S. Muthukrishhan: An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms 55 (2005) 58-75. Problem

More information

Finding Frequent Items in Probabilistic Data

Finding Frequent Items in Probabilistic Data Finding Frequent Items in Probabilistic Data Qin Zhang, Hong Kong University of Science & Technology Feifei Li, Florida State University Ke Yi, Hong Kong University of Science & Technology SIGMOD 2008

More information

Heavy Hitters. Piotr Indyk MIT. Lecture 4

Heavy Hitters. Piotr Indyk MIT. Lecture 4 Heavy Hitters Piotr Indyk MIT Last Few Lectures Recap (last few lectures) Update a vector x Maintain a linear sketch Can compute L p norm of x (in zillion different ways) Questions: Can we do anything

More information

Title: Count-Min Sketch Name: Graham Cormode 1 Affil./Addr. Department of Computer Science, University of Warwick,

Title: Count-Min Sketch Name: Graham Cormode 1 Affil./Addr. Department of Computer Science, University of Warwick, Title: Count-Min Sketch Name: Graham Cormode 1 Affil./Addr. Department of Computer Science, University of Warwick, Coventry, UK Keywords: streaming algorithms; frequent items; approximate counting, sketch

More information

Improved Algorithms for Distributed Entropy Monitoring

Improved Algorithms for Distributed Entropy Monitoring Improved Algorithms for Distributed Entropy Monitoring Jiecao Chen Indiana University Bloomington jiecchen@umail.iu.edu Qin Zhang Indiana University Bloomington qzhangcs@indiana.edu Abstract Modern data

More information

What s New: Finding Significant Differences in Network Data Streams

What s New: Finding Significant Differences in Network Data Streams What s New: Finding Significant Differences in Network Data Streams Graham Cormode S. Muthukrishnan Abstract Monitoring and analyzing network traffic usage patterns is vital for managing IP Networks. An

More information

Part 1: Hashing and Its Many Applications

Part 1: Hashing and Its Many Applications 1 Part 1: Hashing and Its Many Applications Sid C-K Chau Chi-Kin.Chau@cl.cam.ac.u http://www.cl.cam.ac.u/~cc25/teaching Why Randomized Algorithms? 2 Randomized Algorithms are algorithms that mae random

More information

Optimal Elephant Flow Detection

Optimal Elephant Flow Detection Optimal Elephant Flow Detection Ran Ben Basat Computer Science Department Technion sran@cs.technion.ac.il Gil Einziger Nokia Bell Labs gil.einziger@nokia.com Roy Friedman Computer Science Department Technion

More information

Effective computation of biased quantiles over data streams

Effective computation of biased quantiles over data streams Effective computation of biased quantiles over data streams Graham Cormode cormode@bell-labs.com S. Muthukrishnan muthu@cs.rutgers.edu Flip Korn flip@research.att.com Divesh Srivastava divesh@research.att.com

More information

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018 Today we ll talk about a topic that is both very old (as far as computer science

More information

Tracking Distributed Aggregates over Time-based Sliding Windows

Tracking Distributed Aggregates over Time-based Sliding Windows Tracking Distributed Aggregates over Time-based Sliding Windows Graham Cormode and Ke Yi Abstract. The area of distributed monitoring requires tracking the value of a function of distributed data as new

More information

Computing the Entropy of a Stream

Computing the Entropy of a Stream Computing the Entropy of a Stream To appear in SODA 2007 Graham Cormode graham@research.att.com Amit Chakrabarti Dartmouth College Andrew McGregor U. Penn / UCSD Outline Introduction Entropy Upper Bound

More information

HeavyKeeper: An Accurate Algorithm for Finding Top-k Elephant Flows

HeavyKeeper: An Accurate Algorithm for Finding Top-k Elephant Flows : An Accurate Algorithm for Finding Top-k Elephant Flows Junzhi Gong Peking University Tong Yang* Peking University Steve Uhlig UK & Queen Mary, University of London Lorna Uden Staffordshire University

More information

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing

Lecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of

More information

Succinct Approximate Counting of Skewed Data

Succinct Approximate Counting of Skewed Data Succinct Approximate Counting of Skewed Data David Talbot Google Inc., Mountain View, CA, USA talbot@google.com Abstract Practical data analysis relies on the ability to count observations of objects succinctly

More information

Distributed Summaries

Distributed Summaries Distributed Summaries Graham Cormode graham@research.att.com Pankaj Agarwal(Duke) Zengfeng Huang (HKUST) Jeff Philips (Utah) Zheiwei Wei (HKUST) Ke Yi (HKUST) Summaries Summaries allow approximate computations:

More information

Linear Sketches A Useful Tool in Streaming and Compressive Sensing

Linear Sketches A Useful Tool in Streaming and Compressive Sensing Linear Sketches A Useful Tool in Streaming and Compressive Sensing Qin Zhang 1-1 Linear sketch Random linear projection M : R n R k that preserves properties of any v R n with high prob. where k n. M =

More information

Estimating Frequencies and Finding Heavy Hitters Jonas Nicolai Hovmand, Morten Houmøller Nygaard,

Estimating Frequencies and Finding Heavy Hitters Jonas Nicolai Hovmand, Morten Houmøller Nygaard, Estimating Frequencies and Finding Heavy Hitters Jonas Nicolai Hovmand, 2011 3884 Morten Houmøller Nygaard, 2011 4582 Master s Thesis, Computer Science June 2016 Main Supervisor: Gerth Stølting Brodal

More information

The Count-Min-Sketch and its Applications

The Count-Min-Sketch and its Applications The Count-Min-Sketch and its Applications Jannik Sundermeier Abstract In this thesis, we want to reveal how to get rid of a huge amount of data which is at least dicult or even impossible to store in local

More information

PAPER Adaptive Bloom Filter : A Space-Efficient Counting Algorithm for Unpredictable Network Traffic

PAPER Adaptive Bloom Filter : A Space-Efficient Counting Algorithm for Unpredictable Network Traffic IEICE TRANS.??, VOL.Exx??, NO.xx XXXX x PAPER Adaptive Bloom Filter : A Space-Efficient Counting Algorithm for Unpredictable Network Traffic Yoshihide MATSUMOTO a), Hiroaki HAZEYAMA b), and Youki KADOBAYASHI

More information

Data Streams & Communication Complexity

Data Streams & Communication Complexity Data Streams & Communication Complexity Lecture 1: Simple Stream Statistics in Small Space Andrew McGregor, UMass Amherst 1/25 Data Stream Model Stream: m elements from universe of size n, e.g., x 1, x

More information

Modeling Residual-Geometric Flow Sampling

Modeling Residual-Geometric Flow Sampling Modeling Residual-Geometric Flow Sampling Xiaoming Wang Joint work with Xiaoyong Li and Dmitri Loguinov Amazon.com Inc., Seattle, WA April 13 th, 2011 1 Agenda Introduction Underlying model of residual

More information

Wavelet decomposition of data streams. by Dragana Veljkovic

Wavelet decomposition of data streams. by Dragana Veljkovic Wavelet decomposition of data streams by Dragana Veljkovic Motivation Continuous data streams arise naturally in: telecommunication and internet traffic retail and banking transactions web server log records

More information

Block Heavy Hitters Alexandr Andoni, Khanh Do Ba, and Piotr Indyk

Block Heavy Hitters Alexandr Andoni, Khanh Do Ba, and Piotr Indyk Computer Science and Artificial Intelligence Laboratory Technical Report -CSAIL-TR-2008-024 May 2, 2008 Block Heavy Hitters Alexandr Andoni, Khanh Do Ba, and Piotr Indyk massachusetts institute of technology,

More information

Leveraging Big Data: Lecture 13

Leveraging Big Data: Lecture 13 Leveraging Big Data: Lecture 13 http://www.cohenwang.com/edith/bigdataclass2013 Instructors: Edith Cohen Amos Fiat Haim Kaplan Tova Milo What are Linear Sketches? Linear Transformations of the input vector

More information

HeavyGuardian: Separate and Guard Hot Items in Data Streams

HeavyGuardian: Separate and Guard Hot Items in Data Streams : Separate and Guard Hot Items in Data Streams Tong Yang Peking University Junzhi Gong Peking University Haowei Zhang Peking University Lei Zou Peking University ABSTRACT 1 Data stream processing is a

More information

Streaming and communication complexity of Hamming distance

Streaming and communication complexity of Hamming distance Streaming and communication complexity of Hamming distance Tatiana Starikovskaya IRIF, Université Paris-Diderot (Joint work with Raphaël Clifford, ICALP 16) Approximate pattern matching Problem Pattern

More information

Sparse analysis Lecture V: From Sparse Approximation to Sparse Signal Recovery

Sparse analysis Lecture V: From Sparse Approximation to Sparse Signal Recovery Sparse analysis Lecture V: From Sparse Approximation to Sparse Signal Recovery Anna C. Gilbert Department of Mathematics University of Michigan Connection between... Sparse Approximation and Compressed

More information

Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University

Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University Verifying Computations in the Cloud (and Elsewhere) Michael Mitzenmacher, Harvard University Work offloaded to Justin Thaler, Harvard University Goals of Verifiable Computation Provide user with correctness

More information

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window Costas Busch 1 and Srikanta Tirthapura 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY

More information

On Differentially Private Longest Increasing Subsequence Computation in Data Stream

On Differentially Private Longest Increasing Subsequence Computation in Data Stream 73 100 On Differentially Private Longest Increasing Subsequence Computation in Data Stream Luca Bonomi 1,, Li Xiong 2 1 University of California San Diego, La Jolla, CA. 2 Emory University, Atlanta, GA.

More information

Optimal Random Sampling from Distributed Streams Revisited

Optimal Random Sampling from Distributed Streams Revisited Optimal Random Sampling from Distributed Streams Revisited Srikanta Tirthapura 1 and David P. Woodruff 2 1 Dept. of ECE, Iowa State University, Ames, IA, 50011, USA. snt@iastate.edu 2 IBM Almaden Research

More information

Acceptably inaccurate. Probabilistic data structures

Acceptably inaccurate. Probabilistic data structures Acceptably inaccurate Probabilistic data structures Hello Today's talk Motivation Bloom filters Count-Min Sketch HyperLogLog Motivation Tape HDD SSD Memory Tape HDD SSD Memory Speed Tape HDD SSD Memory

More information

Data Stream Methods. Graham Cormode S. Muthukrishnan

Data Stream Methods. Graham Cormode S. Muthukrishnan Data Stream Methods Graham Cormode graham@dimacs.rutgers.edu S. Muthukrishnan muthu@cs.rutgers.edu Plan of attack Frequent Items / Heavy Hitters Counting Distinct Elements Clustering items in Streams Motivating

More information

Estimating Dominance Norms of Multiple Data Streams

Estimating Dominance Norms of Multiple Data Streams Estimating Dominance Norms of Multiple Data Streams Graham Cormode and S. Muthukrishnan Abstract. There is much focus in the algorithms and database communities on designing tools to manage and mine data

More information

Communication-Efficient Computation on Distributed Noisy Datasets. Qin Zhang Indiana University Bloomington

Communication-Efficient Computation on Distributed Noisy Datasets. Qin Zhang Indiana University Bloomington Communication-Efficient Computation on Distributed Noisy Datasets Qin Zhang Indiana University Bloomington SPAA 15 June 15, 2015 1-1 Model of computation The coordinator model: k sites and 1 coordinator.

More information

A Mergeable Summaries

A Mergeable Summaries A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires

More information

A Comparative Study of Two Network-based Anomaly Detection Methods

A Comparative Study of Two Network-based Anomaly Detection Methods A Comparative Study of Two Network-based Anomaly Detection Methods Kaustubh Nyalkalkar, Sushant Sinha, Michael Bailey and Farnam Jahanian Electrical Engineering and Computer Science, University of Michigan,

More information

Pay for a Sliding Bloom Filter and Get Counting, Distinct Elements, and Entropy for Free

Pay for a Sliding Bloom Filter and Get Counting, Distinct Elements, and Entropy for Free Pay for a Sliding Bloom Filter and Get Counting, Distinct Elements, and Entropy for Free Eran Assaf Hebrew University Ran Ben Basat Technion Gil Einziger Nokia Bell Labs Roy Friedman Technion Abstract

More information

CR-precis: A deterministic summary structure for update data streams

CR-precis: A deterministic summary structure for update data streams CR-precis: A deterministic summary structure for update data streams Sumit Ganguly 1 and Anirban Majumder 2 1 Indian Institute of Technology, Kanpur 2 Lucent Technologies, Bangalore Abstract. We present

More information

Large-Scale Behavioral Targeting

Large-Scale Behavioral Targeting Large-Scale Behavioral Targeting Ye Chen, Dmitry Pavlov, John Canny ebay, Yandex, UC Berkeley (This work was conducted at Yahoo! Labs.) June 30, 2009 Chen et al. (KDD 09) Large-Scale Behavioral Targeting

More information

The Bloom Paradox: When not to Use a Bloom Filter

The Bloom Paradox: When not to Use a Bloom Filter 1 The Bloom Paradox: When not to Use a Bloom Filter Ori Rottenstreich and Isaac Keslassy Abstract In this paper, we uncover the Bloom paradox in Bloom filters: sometimes, the Bloom filter is harmful and

More information

First-Order DPA Attack Against AES in Counter Mode w/ Unknown Counter. DPA Attack, typical structure

First-Order DPA Attack Against AES in Counter Mode w/ Unknown Counter. DPA Attack, typical structure Josh Jaffe CHES 2007 Cryptography Research, Inc. www.cryptography.com 575 Market St., 21 st Floor, San Francisco, CA 94105 1998-2007 Cryptography Research, Inc. Protected under issued and/or pending US

More information

Anomaly Extraction in Backbone Networks using Association Rules

Anomaly Extraction in Backbone Networks using Association Rules 1 Anomaly Extraction in Backbone Networks using Association Rules Daniela Brauckhoff, Xenofontas Dimitropoulos, Arno Wagner, and Kave Salamatian TIK-Report 309 ftp://ftp.tik.ee.ethz.ch/pub/publications/tik-report-309.pdf

More information

A Mergeable Summaries

A Mergeable Summaries A Mergeable Summaries Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang, Jeff M. Phillips, Zhewei Wei, and Ke Yi We study the mergeability of data summaries. Informally speaking, mergeability requires

More information

Continuous Matrix Approximation on Distributed Data

Continuous Matrix Approximation on Distributed Data Continuous Matrix Approximation on Distributed Data Mina Ghashami School of Computing University of Utah ghashami@cs.uah.edu Jeff M. Phillips School of Computing University of Utah jeffp@cs.uah.edu Feifei

More information

Introduction to Randomized Algorithms III

Introduction to Randomized Algorithms III Introduction to Randomized Algorithms III Joaquim Madeira Version 0.1 November 2017 U. Aveiro, November 2017 1 Overview Probabilistic counters Counting with probability 1 / 2 Counting with probability

More information

Tight Bounds for Sliding Bloom Filters

Tight Bounds for Sliding Bloom Filters Tight Bounds for Sliding Bloom Filters Moni Naor Eylon Yogev November 7, 2013 Abstract A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error

More information

Interconnect Power and Delay Optimization by Dynamic Programming in Gridded Design Rules

Interconnect Power and Delay Optimization by Dynamic Programming in Gridded Design Rules Interconnect Power and Delay Optimization by Dynamic Programming in Gridded Design Rules Konstantin Moiseev, Avinoam Kolodny EE Dept. Technion, Israel Institute of Technology Shmuel Wimer Eng. School,

More information

Estimating Dominance Norms of Multiple Data Streams Graham Cormode Joint work with S. Muthukrishnan

Estimating Dominance Norms of Multiple Data Streams Graham Cormode Joint work with S. Muthukrishnan Estimating Dominance Norms of Multiple Data Streams Graham Cormode graham@dimacs.rutgers.edu Joint work with S. Muthukrishnan Data Stream Phenomenon Data is being produced faster than our ability to process

More information

Identifying Global Icebergs in Distributed Streams

Identifying Global Icebergs in Distributed Streams Identifying Global Icebergs in Distributed Streams Emmanuelle Anceaume, Yann Busnel, Nicolò Rivetti, Bruno Sericola To cite this version: Emmanuelle Anceaume, Yann Busnel, Nicolò Rivetti, Bruno Sericola.

More information

Time-Decayed Correlated Aggregates over Data Streams

Time-Decayed Correlated Aggregates over Data Streams Time-Decayed Correlated Aggregates over Data Streams Graham Cormode AT&T Labs Research graham@research.att.com Srikanta Tirthapura Bojian Xu ECE Dept., Iowa State University {snt,bojianxu}@iastate.edu

More information

Min Congestion Control for High- Speed Heterogeneous Networks. JetMax: Scalable Max-Min

Min Congestion Control for High- Speed Heterogeneous Networks. JetMax: Scalable Max-Min JetMax: Scalable Max-Min Min Congestion Control for High- Speed Heterogeneous Networks Yueping Zhang Joint work with Derek Leonard and Dmitri Loguinov Internet Research Lab Department of Computer Science

More information

CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014

CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014 CS 598CSC: Algorithms for Big Data Lecture date: Sept 11, 2014 Instructor: Chandra Cheuri Scribe: Chandra Cheuri The Misra-Greis deterministic counting guarantees that all items with frequency > F 1 /

More information

Methodology for Computer Science Research Lecture 4: Mathematical Modeling

Methodology for Computer Science Research Lecture 4: Mathematical Modeling Methodology for Computer Science Research Andrey Lukyanenko Department of Computer Science and Engineering Aalto University, School of Science and Technology andrey.lukyanenko@tkk.fi Definitions and Goals

More information

A Simpler and More Efficient Deterministic Scheme for Finding Frequent Items over Sliding Windows

A Simpler and More Efficient Deterministic Scheme for Finding Frequent Items over Sliding Windows A Simpler and More Efficient Deterministic Scheme for Finding Frequent Items over Sliding Windows ABSTRACT L.K. Lee Department of Computer Science The University of Hong Kong Pokfulam, Hong Kong lklee@cs.hku.hk

More information

Tight Bounds for Distributed Functional Monitoring

Tight Bounds for Distributed Functional Monitoring Tight Bounds for Distributed Functional Monitoring Qin Zhang MADALGO, Aarhus University Joint with David Woodruff, IBM Almaden NII Shonan meeting, Japan Jan. 2012 1-1 The distributed streaming model (a.k.a.

More information

Window-aware Load Shedding for Aggregation Queries over Data Streams

Window-aware Load Shedding for Aggregation Queries over Data Streams Window-aware Load Shedding for Aggregation Queries over Data Streams Nesime Tatbul Stan Zdonik Talk Outline Background Load shedding in Aurora Windowed aggregation queries Window-aware load shedding Experimental

More information

A Near-Optimal Algorithm for Computing the Entropy of a Stream

A Near-Optimal Algorithm for Computing the Entropy of a Stream A Near-Optimal Algorithm for Computing the Entropy of a Stream Amit Chakrabarti ac@cs.dartmouth.edu Graham Cormode graham@research.att.com Andrew McGregor andrewm@seas.upenn.edu Abstract We describe a

More information

An Integrated Efficient Solution for Computing Frequent and Top-k Elements in Data Streams

An Integrated Efficient Solution for Computing Frequent and Top-k Elements in Data Streams An Integrated Efficient Solution for Computing Frequent and Top-k Elements in Data Streams AHMED METWALLY, DIVYAKANT AGRAWAL, and AMR EL ABBADI University of California, Santa Barbara We propose an approximate

More information

An Early Traffic Sampling Algorithm

An Early Traffic Sampling Algorithm An Early Traffic Sampling Algorithm Hou Ying ( ), Huang Hai, Chen Dan, Wang ShengNan, and Li Peng National Digital Switching System Engineering & Techological R&D center, ZhengZhou, 450002, China ndschy@139.com,

More information

Time-Decaying Aggregates in Out-of-order Streams

Time-Decaying Aggregates in Out-of-order Streams DIMACS Technical Report 2007-10 Time-Decaying Aggregates in Out-of-order Streams by Graham Cormode 1 AT&T Labs Research graham@research.att.com Flip Korn 2 AT&T Labs Research flip@research.att.com Srikanta

More information

26 Mergeable Summaries

26 Mergeable Summaries 26 Mergeable Summaries PANKAJ K. AGARWAL, Duke University GRAHAM CORMODE, University of Warwick ZENGFENG HUANG, Aarhus University JEFF M. PHILLIPS, University of Utah ZHEWEI WEI, Aarhus University KE YI,

More information

Resource Allocation for Video Streaming in Wireless Environment

Resource Allocation for Video Streaming in Wireless Environment Resource Allocation for Video Streaming in Wireless Environment Shahrokh Valaee and Jean-Charles Gregoire Abstract This paper focuses on the development of a new resource allocation scheme for video streaming

More information

arxiv: v1 [cs.lg] 7 Nov 2017

arxiv: v1 [cs.lg] 7 Nov 2017 Finding Heavily-Weighted Features in Data Streams Kai Sheng Tai, Vatsal Sharan, Peter Bailis, Gregory Valiant {kst, vsharan, pbailis, valiant}@cs.stanford.edu Stanford University arxiv:1711.02305v1 [cs.lg]

More information

Statistical modelling of TV interference for shared-spectrum devices

Statistical modelling of TV interference for shared-spectrum devices Statistical modelling of TV interference for shared-spectrum devices Industrial mathematics sktp Project Jamie Fairbrother Keith Briggs Wireless Research Group BT Technology, Service & Operations St. Catherine

More information

Lecture 1: Introduction to Sublinear Algorithms

Lecture 1: Introduction to Sublinear Algorithms CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 1: Introduction to Sublinear Algorithms March 31, 2014 Lecturer: Paul Beame Scribe: Paul Beame Too much data, too little time, space for

More information

A fast randomized algorithm for approximating an SVD of a matrix

A fast randomized algorithm for approximating an SVD of a matrix A fast randomized algorithm for approximating an SVD of a matrix Joint work with Franco Woolfe, Edo Liberty, and Vladimir Rokhlin Mark Tygert Program in Applied Mathematics Yale University Place July 17,

More information

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems

An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems An Optimal Algorithm for l 1 -Heavy Hitters in Insertion Streams and Related Problems Arnab Bhattacharyya Indian Institute of Science, Bangalore arnabb@csa.iisc.ernet.in Palash Dey Indian Institute of

More information

Differential Privacy and Pan-Private Algorithms. Cynthia Dwork, Microsoft Research

Differential Privacy and Pan-Private Algorithms. Cynthia Dwork, Microsoft Research Differential Privacy and Pan-Private Algorithms Cynthia Dwork, Microsoft Research A Dream? C? Original Database Sanitization Very Vague AndVery Ambitious Census, medical, educational, financial data, commuting

More information

Maximum Likelihood Estimation of the Flow Size Distribution Tail Index from Sampled Packet Data

Maximum Likelihood Estimation of the Flow Size Distribution Tail Index from Sampled Packet Data Maximum Likelihood Estimation of the Flow Size Distribution Tail Index from Sampled Packet Data Patrick Loiseau 1, Paulo Gonçalves 1, Stéphane Girard 2, Florence Forbes 2, Pascale Vicat-Blanc Primet 1

More information

COMP9334: Capacity Planning of Computer Systems and Networks

COMP9334: Capacity Planning of Computer Systems and Networks COMP9334: Capacity Planning of Computer Systems and Networks Week 2: Operational analysis Lecturer: Prof. Sanjay Jha NETWORKS RESEARCH GROUP, CSE, UNSW Operational analysis Operational: Collect performance

More information

Algorithms for Querying Noisy Distributed/Streaming Datasets

Algorithms for Querying Noisy Distributed/Streaming Datasets Algorithms for Querying Noisy Distributed/Streaming Datasets Qin Zhang Indiana University Bloomington Sublinear Algo Workshop @ JHU Jan 9, 2016 1-1 The big data models The streaming model (Alon, Matias

More information

Prompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014

Prompt Network Anomaly Detection using SSA-Based Change-Point Detection. Hao Chen 3/7/2014 Prompt Network Anomaly Detection using SSA-Based Change-Point Detection Hao Chen 3/7/2014 Network Anomaly Detection Network Intrusion Detection (NID) Signature-based detection Detect known attacks Use

More information

To Randomize or Not To

To Randomize or Not To To Randomize or Not To Randomize: Space Optimal Summaries for Hyperlink Analysis Tamás Sarlós, Eötvös University and Computer and Automation Institute, Hungarian Academy of Sciences Joint work with András

More information

1. Introduction. CONTINUOUS MONITORING OF DISTRIBUTED DATA STREAMS OVER A TIME-BASED SLIDING WINDOW arxiv: v2 [cs.

1. Introduction. CONTINUOUS MONITORING OF DISTRIBUTED DATA STREAMS OVER A TIME-BASED SLIDING WINDOW arxiv: v2 [cs. Symposium on Theoretical Aspects of Computer Science 2010 (Nancy, France), pp. 179-190 www.stacs-conf.org CONTINUOUS MONITORING OF DISTRIBUTED DATA STREAMS OVER A TIME-BASED SLIDING WINDOW arxiv:0912.4569v2

More information

Efficient Sketches for Earth-Mover Distance, with Applications

Efficient Sketches for Earth-Mover Distance, with Applications Efficient Sketches for Earth-Mover Distance, with Applications Alexandr Andoni MIT andoni@mit.edu Khanh Do Ba MIT doba@mit.edu Piotr Indyk MIT indyk@mit.edu David Woodruff IBM Almaden dpwoodru@us.ibm.com

More information

Observed structure of addresses in IP traffic

Observed structure of addresses in IP traffic Observed structure of addresses in IP traffic Eddie Kohler, Jinyang Li, Vern Paxson, Scott Shenker ICSI Center for Internet Research Thanks to David Donoho and Dick Karp Problem How can we model the set

More information

14.1 Finding frequent elements in stream

14.1 Finding frequent elements in stream Chapter 14 Streaming Data Model 14.1 Finding frequent elements in stream A very useful statistics for many applications is to keep track of elements that occur more frequently. It can come in many flavours

More information

OECD QSAR Toolbox v.3.4

OECD QSAR Toolbox v.3.4 OECD QSAR Toolbox v.3.4 Predicting developmental and reproductive toxicity of Diuron (CAS 330-54-1) based on DART categorization tool and DART SAR model Outlook Background Objectives The exercise Workflow

More information

Randomized Selection on the GPU. Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory

Randomized Selection on the GPU. Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory Randomized Selection on the GPU Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory High Performance Graphics 2011 August 6, 2011 Top k Selection on GPU Output the top k keys

More information

.Cycle counting: the next generation. Matthew Skala 30 January 2013

.Cycle counting: the next generation. Matthew Skala 30 January 2013 .Cycle counting: the next generation δ ζ µ β ι θ γ η ε λ κ σ ρ α o Matthew Skala 30 January 2013 Outline Cycle counting ECCHI Knight s Tours Equivalent circuits The next generation Cycle counting Let G

More information

Count-Min Tree Sketch: Approximate counting for NLP

Count-Min Tree Sketch: Approximate counting for NLP Count-Min Tree Sketch: Approximate counting for NLP Guillaume Pitel, Geoffroy Fouquier, Emmanuel Marchand and Abdul Mouhamadsultane exensa firstname.lastname@exensa.com arxiv:64.5492v [cs.ir] 9 Apr 26

More information

Space efficient tracking of persistent items in a massive data stream

Space efficient tracking of persistent items in a massive data stream Electrical and Computer Engineering Publications Electrical and Computer Engineering 2014 Space efficient tracking of persistent items in a massive data stream Impetus Technologies Srikanta Tirthapura

More information

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application

Administrivia. Course Objectives. Overview. Lecture Notes Week markem/cs333/ 2. Staff. 3. Prerequisites. 4. Grading. 1. Theory and application Administrivia 1. markem/cs333/ 2. Staff 3. Prerequisites 4. Grading Course Objectives 1. Theory and application 2. Benefits 3. Labs TAs Overview 1. What is a computer system? CPU PC ALU System bus Memory

More information

Identifying and Estimating Persistent Items in Data Streams

Identifying and Estimating Persistent Items in Data Streams 1 Identifying and Estimating Persistent Items in Data Streams Haipeng Dai Muhammad Shahzad Alex X. Liu Meng Li Yuankun Zhong Guihai Chen Abstract This paper addresses the fundamental problem of finding

More information

Probabilistic Models. Models describe how (a portion of) the world works

Probabilistic Models. Models describe how (a portion of) the world works Probabilistic Models Models describe how (a portion of) the world works Models are always simplifications May not account for every variable May not account for all interactions between variables All models

More information

Table I: Symbols and descriptions

Table I: Symbols and descriptions 2017 IEEE 33rd International Conference on Data Engineering Tracking Matrix Approximation over Distributed Sliding Windows Haida Zhang* Zengfeng Huang* School of CSE, UNSW {haida.zhang, zengfeng.huang}@unsw.edu.au

More information

Simple and Deterministic Matrix Sketches

Simple and Deterministic Matrix Sketches Simple and Deterministic Matrix Sketches Edo Liberty + ongoing work with: Mina Ghashami, Jeff Philips and David Woodruff. Edo Liberty: Simple and Deterministic Matrix Sketches 1 / 41 Data Matrices Often

More information

Bayes Nets: Independence

Bayes Nets: Independence Bayes Nets: Independence [These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.] Bayes Nets A Bayes

More information

A Continuous Sampling from Distributed Streams

A Continuous Sampling from Distributed Streams A Continuous Sampling from Distributed Streams Graham Cormode, AT&T Labs Research S. Muthukrishnan, Rutgers University Ke Yi, HKUST Qin Zhang, MADALGO, Aarhus University A fundamental problem in data management

More information

Space-optimal Heavy Hitters with Strong Error Bounds

Space-optimal Heavy Hitters with Strong Error Bounds Space-optimal Heavy Hitters with Strong Error Bounds Graham Cormode graham@research.att.com Radu Berinde(MIT) Piotr Indyk(MIT) Martin Strauss (U. Michigan) The Frequent Items Problem TheFrequent Items

More information