
Announcements: no class Monday (metric embedding seminar).

Review: expectation, the notion of high probability, Markov's inequality. Today: Book 4.1, 3.3, 4.2.

Chebyshev. Recall variance and standard deviation: $\sigma_X^2 = E[(X - \mu_X)^2]$. Since $E[XY] = E[X]E[Y]$ when $X, Y$ are independent, the variance of a sum of independent variables is the sum of their variances.

$$\Pr[|X - \mu| \ge t\sigma] = \Pr[(X - \mu)^2 \ge t^2\sigma^2] \le 1/t^2.$$

So Chebyshev predicts $X$ won't stray much beyond a standard deviation from its mean. Binomial distribution: variance $np(1-p)$, stdev $O(\sqrt{n})$. Chebyshev requires only a mean and a variance: less applicable but more powerful than Markov.

Balls in bins: Chebyshev gives error probability only $1/\ln^2 n$. Real applications later.
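A quick numerical check (my addition, not from the lecture): compare the empirical binomial tail with Chebyshev's $1/t^2$ prediction.

```python
# Sketch: empirical Pr[|X - mu| >= t*sigma] for X ~ Binomial(n, p)
# versus the Chebyshev bound 1/t^2. Parameters are illustrative.
import random
import math

n, p = 1000, 0.5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))  # variance np(1-p), stdev O(sqrt(n))

trials = 2000
for t in (1, 2, 3):
    hits = sum(
        abs(sum(random.random() < p for _ in range(n)) - mu) >= t * sigma
        for _ in range(trials)
    )
    print(f"t={t}: empirical {hits / trials:.3f} <= Chebyshev {1 / t**2:.3f}")
```

The empirical tail is far smaller than $1/t^2$; that gap is what Chernoff closes.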

Chernoff Bound Intro

Markov: $\Pr[f(X) > z] < E[f(X)]/z$ for nonnegative $f$. Chebyshev used $X^2$ as the $f$; other functions yield other bounds. Chernoff's is the most popular.

Theorem: Let the $X_i$ be Poisson trials (i.e., independent 0/1, with $\Pr[X_i = 1] = p_i$), $X = \sum X_i$, and $E[X] = \mu$. Then

$$\Pr[X > (1+\epsilon)\mu] < \left[\frac{e^{\epsilon}}{(1+\epsilon)^{(1+\epsilon)}}\right]^{\mu}.$$

Note: independent of $n$, exponential in $\mu$.

Proof. For any $t > 0$,

$$\Pr[X > (1+\epsilon)\mu] = \Pr[\exp(tX) > \exp(t(1+\epsilon)\mu)] < \frac{E[\exp(tX)]}{\exp(t(1+\epsilon)\mu)}.$$

Use independence: $E[\exp(tX)] = \prod_i E[\exp(tX_i)]$, where

$$E[\exp(tX_i)] = p_i e^t + (1 - p_i) = 1 + p_i(e^t - 1) \le \exp(p_i(e^t - 1)),$$

and $\prod_i \exp(p_i(e^t - 1)) = \exp(\mu(e^t - 1))$. So the overall bound is

$$\frac{\exp((e^t - 1)\mu)}{\exp(t(1+\epsilon)\mu)}.$$

True for any $t$. To minimize, plug in $t = \ln(1+\epsilon)$.
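A numerical sanity check (my sketch, not part of the notes): the moment-generating-function bound $\exp((e^t - 1)\mu - t(1+\epsilon)\mu)$ at the minimizing $t = \ln(1+\epsilon)$ reproduces the theorem's form, and other choices of $t$ only do worse.

```python
# Sketch: verify t = ln(1+eps) minimizes the MGF bound from the proof above.
import math

def mgf_bound(mu, eps, t):
    # exp((e^t - 1) * mu) / exp(t * (1 + eps) * mu), valid for any t > 0
    return math.exp((math.exp(t) - 1) * mu - t * (1 + eps) * mu)

def chernoff(mu, eps):
    # the optimized form [e^eps / (1+eps)^(1+eps)]^mu
    return (math.exp(eps) / (1 + eps) ** (1 + eps)) ** mu

mu, eps = 20.0, 0.5
t_star = math.log(1 + eps)
print(chernoff(mu, eps), mgf_bound(mu, eps, t_star))                      # identical
print(mgf_bound(mu, eps, 0.5 * t_star), mgf_bound(mu, eps, 2 * t_star))   # both larger
```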

Simpler bounds:
- less than $e^{-\mu\epsilon^2/3}$ for $\epsilon < 1$
- less than $e^{-\mu\epsilon^2/4}$ for $\epsilon < 2e - 1$
- less than $2^{-(1+\epsilon)\mu}$ for larger $\epsilon$.

By the same argument applied to $\exp(-tX)$:

$$\Pr[X < (1-\epsilon)\mu] < \left[\frac{e^{-\epsilon}}{(1-\epsilon)^{(1-\epsilon)}}\right]^{\mu} \le e^{-\mu\epsilon^2/2}.$$

Basic application: $cn \log n$ balls in $n$ bins; w.h.p. the max load matches the average to within a constant factor. A fortiori, an $O(\log n)$ max-load bound holds for $n$ balls in $n$ bins (sketch below).

General observations: the bound trails off when $\epsilon \approx 1/\sqrt{\mu}$, i.e., at absolute error $\sqrt{\mu}$. No surprise, since the standard deviation is around $\sqrt{\mu}$ (recall Chebyshev). If $\mu = \Omega(\log n)$, the probability of a constant-$\epsilon$ deviation is $O(1/n)$: useful when a polynomial number of events must all behave. Note the similarity to the Gaussian distribution. Generalizes: the bound applies to any independent variables distributed in the range $[0, 1]$. Zillions of Chernoff applications.
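A sketch of the basic application above (the constants $n$ and $c$ are my illustrative choices): with $\mu = c \ln n$ balls expected per bin, the max load tracks the average.

```python
# Sketch: throw c*n*ln(n) balls into n bins uniformly; per Chernoff with
# mu = c ln n, the max load stays within a constant factor of the average.
import random
import math
from collections import Counter

n, c = 1000, 3
m = int(c * n * math.log(n))                     # c n ln n balls
loads = Counter(random.randrange(n) for _ in range(m))
avg = m / n
print(f"average load {avg:.1f}, max load {max(loads.values())}")
```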

Median finding

First main application of Chernoff: random sampling. Given a list $L$, the median of a sample looks like the median of the whole, i.e., it lands in a small neighborhood of it. Analysis via the Chernoff bound.

Algorithm:
- choose $s$ samples with replacement
- take fences before and after the sample median
- keep the items between the fences; sort them.

Analysis claim: (i) the median falls within the fences, and (ii) few items lie between the fences. Without loss of generality $L$ contains $1, \ldots, n$ (fine for a comparison-based algorithm). Let $S_1, \ldots, S_s$ be the samples in sorted order.

Lemma: $S_r$ is near $rn/s$. The expected number of samples preceding item $k$ is $ks/n$. Chernoff: w.h.p., for all $k$, the number of samples before $k$ is $(1 \pm \epsilon_k)ks/n$, where $\epsilon_k = \sqrt{(6n \ln n)/(ks)}$. Thus, when $k > n/4$, we have $\epsilon_k \le \epsilon$, where $\epsilon = \sqrt{(24 \ln n)/s}$.

So $S_{(1+\epsilon)ks/n} > k$, which gives $S_r > rn/(s(1+\epsilon))$; symmetrically, $S_r < rn/(s(1-\epsilon))$.

Let $r_0 = \frac{s}{2}(1-\epsilon)$. Then w.h.p. $\frac{n}{2} \cdot \frac{1-\epsilon}{1+\epsilon} < S_{r_0} < n/2$.

Let $r_1 = \frac{s}{2}(1+\epsilon)$. Then $S_{r_1} > n/2$.
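Before totaling the costs (continued below), a minimal runnable sketch of the algorithm just described; the fences follow the $r_0, r_1$ above, and the choice of $s$ anticipates the balance computed next (constants illustrative, not tuned):

```python
# Sketch of median finding by sampling. W.h.p. the true median lies between
# the fences and only O(eps*n) items survive the filter.
import random
import math

def sampling_median(L):
    n = len(L)
    s = max(16, int(n ** (2 / 3) * math.log(n) ** (1 / 3)))  # balance point
    eps = math.sqrt(24 * math.log(n) / s)
    sample = sorted(random.choice(L) for _ in range(s))      # with replacement
    r0 = max(0, int(s / 2 * (1 - eps)) - 1)                  # lower fence index
    r1 = min(s - 1, int(s / 2 * (1 + eps)))                  # upper fence index
    lo, hi = sample[r0], sample[r1]
    middle = sorted(x for x in L if lo <= x <= hi)           # O(eps*n) items w.h.p.
    rank_lo = sum(1 for x in L if x < lo)                    # rank of lo within L
    return middle[n // 2 - rank_lo]                          # the median, w.h.p.

L = random.sample(range(10 ** 6), 100001)
print(sampling_median(L), sorted(L)[len(L) // 2])            # should agree
```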

But $S_{r_1} - S_{r_0} = O(\epsilon n)$. Number of elements in the sample sort: $s$. Size of the set containing the median: $O(\epsilon n) = O(n\sqrt{(\log n)/s})$. Balancing the two steps: $s = n^{2/3}(\log n)^{1/3}$, giving $O(n^{2/3}(\log n)^{1/3})$ work in both steps.

Randomized is strictly better here: it gives an important constant-factor improvement. Optimum deterministic: $(2+\epsilon)n$ comparisons. Optimum randomized: $(3/2)n + o(n)$ comparisons. The book's analysis is slightly different.

Routing

Second main application of Chernoff: analysis of load balancing. We already saw the balls-in-bins example. Model:
- synchronous message passing
- bidirectional links, one message per link per step
- queues on the links
- permutation routing
- oblivious algorithms: a packet's route depends only on that packet.

Theorem. Any deterministic oblivious permutation routing requires $\Omega(\sqrt{N}/d)$ steps on an $N$-node degree-$d$ machine. Reason: some edge has lots of paths through it. Homework: a special case.

Hypercube. $N$ nodes, $n = \log_2 N$ dimensions.

$Nn$ directed edges. Bit representation of node names. Natural routing: bit fixing (left to right). Paths of length at most $n$.

Lower bound on routing time? $Nn$ edges for $N$ paths of length $\le n$ suggests no congestion bottleneck, but the deterministic bound is $\Omega(\sqrt{N}/n)$.

Randomized routing algorithm: $O(n) = O(\log N)$ steps. Randomized how? Load-balance the paths.

First idea: random destinations (not a permutation!), bit correction. Average case only, but a good start. Let $T(e_i)$ be the number of paths using edge $e_i$; by symmetry, all the $E[T(e_i)]$ are equal. Expected path length is $n/2$, so by linearity of expectation the expected total path length is $Nn/2$. There are $Nn$ directed edges in the hypercube, so $E[T(e_i)] = 1/2$. Chernoff: every edge gets at most $3n$ paths (probability $1 - 1/N$).

Naive usage: $n$ phases, one per bit; $3n$ time per phase; $O(n^2)$ total.

Worst-case destinations? Idea [Valiant-Brebner]: route each packet to a random intermediate destination, then from the intermediate destination route to the true one. This routes any permutation in $O(n^2)$ expected time.
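A small simulation (my own, assuming the left-to-right bit-fixing order described above) measuring the edge congestion $T(e)$ under random destinations:

```python
# Sketch: bit-fixing routes to uniform random destinations on the n-cube;
# report the maximum number of routes crossing any directed edge.
import random
from collections import Counter

n = 10                              # dimensions; N = 2^n nodes
N = 1 << n

def bit_fix_path(u, v):
    """Directed edges of the left-to-right bit-fixing route from u to v."""
    edges, cur = [], u
    for i in range(n):              # fix bits from the highest down
        bit = 1 << (n - 1 - i)
        if (cur ^ v) & bit:
            edges.append((cur, cur ^ bit))
            cur ^= bit
    return edges

congestion = Counter()
for u in range(N):
    for e in bit_fix_path(u, random.randrange(N)):
        congestion[e] += 1

print("max T(e):", max(congestion.values()), "  Chernoff ceiling 3n =", 3 * n)
```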

What is going on with the $\Omega(\sqrt{N}/n)$ lower bound? The adversary doesn't know our random choices, so it cannot plan a worst-case permutation.

What if we don't wait for the next phase? FIFO queueing: total time is path length plus delay. Expected delay is $E[\sum_l T(e_l)] \le n/2$. Chernoff bound? No: the $T(e_i)$ are dependent.

High-probability bound: consider the packets sharing $i$'s fixed route $\rho_i = (e_1, \ldots, e_k)$. Suppose $S$ packets intersect the route (use at least one of the $e_r$). Claim: $i$'s delay is at most $|S|$. Suppose this is true, and let $H_{ij} = 1$ if packet $j$ hits $i$'s (fixed) route. Then

$$E[\text{delay}] \le E\Big[\sum_j H_{ij}\Big] \le E\Big[\sum_l T(e_l)\Big] \le n/2.$$

Now Chernoff does apply (the $H_{ij}$ are independent for a fixed $i$-route). $|S| = O(n)$ with probability $1 - 2^{-5n}$, so $O(n)$ delay for all $2^n$ paths simultaneously.
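A companion sketch (again my own illustration) estimating $|S|$, the number of packets whose bit-fixing routes share an edge with one fixed packet's route:

```python
# Sketch: |S| for packet 0 under random destinations; E[|S|] <= n/2, and
# packet 0's delay is at most |S| by the lag argument below.
import random

n = 10
N = 1 << n

def bit_fix_edges(u, v):
    """Edge set of the left-to-right bit-fixing route from u to v."""
    edges, cur = set(), u
    for i in range(n):
        bit = 1 << (n - 1 - i)
        if (cur ^ v) & bit:
            edges.add((cur, cur ^ bit))
            cur ^= bit
    return edges

dest = [random.randrange(N) for _ in range(N)]
route0 = bit_fix_edges(0, dest[0])               # packet 0's fixed route rho_0
S = sum(1 for j in range(1, N) if route0 & bit_fix_edges(j, dest[j]))
print(f"|S| = {S}, n/2 = {n / 2}")
```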

Lag argument. Exercise: once packets separate, they don't rejoin. The route for $i$ is $\rho_i = (e_1, \ldots, e_k)$; charge each unit of delay to a departure of a packet from $\rho_i$. A packet waiting to follow $e_j$ at time $t$ has lag $t - j$. The delay of $i$ is its lag when crossing $e_k$. When $i$'s delay rises to $l+1$, some packet from $S$ has lag $l$ (since it crosses $e_j$ instead of $i$). Consider the last time $t$ at which a lag-$l$ packet exists on the path: some lag-$l$ packet $w$ crosses $e_j$ at $t$ (the others increase to lag $l+1$); charge one unit of delay to $w$. And $w$ leaves $\rho_i$ at this point: if not, it would have lag $l$ at $e_{j+1}$ at the next step, contradicting the choice of $t$.

Summary: two key roles for Chernoff:
- sampling
- load balancing
with high-probability results once means reach $\log n$.