The first bound is the strongest; the other two bounds are often easier to state and compute. Recall the setting of Theorem 4.4: $X_1,\dots,X_n$ are independent Poisson trials with $\Pr(X_i = 1) = p_i$, $X = \sum_{i=1}^{n} X_i$, and $\mu = \mathrm{E}[X]$. The three bounds are

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \quad \text{for any } \delta > 0, \qquad (4.4.1)$$

$$\Pr(X \ge (1+\delta)\mu) \le e^{-\mu\delta^2/3} \quad \text{for } 0 < \delta \le 1, \qquad (4.4.2)$$

$$\Pr(X \ge R) \le 2^{-R} \quad \text{for } R \ge 6\mu. \qquad (4.4.3)$$

Proof: Applying Markov's inequality, for any $t > 0$ we have

$$\Pr(X \ge (1+\delta)\mu) = \Pr\!\left(e^{tX} \ge e^{t(1+\delta)\mu}\right) \le \frac{\mathrm{E}[e^{tX}]}{e^{t(1+\delta)\mu}} \le \frac{e^{(e^t-1)\mu}}{e^{t(1+\delta)\mu}},$$

using the bound $\mathrm{E}[e^{tX}] \le e^{(e^t-1)\mu}$ on the moment generating function established earlier. For any $\delta > 0$, we can set $t = \ln(1+\delta) > 0$ to get (4.4.1):

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}.$$

For (4.4.2) we need to show that, for $0 < \delta \le 1$,

$$\frac{e^{\delta}}{(1+\delta)^{1+\delta}} \le e^{-\delta^2/3}.$$

Taking the logarithm of both sides, we obtain the equivalent condition

$$f(\delta) = \delta - (1+\delta)\ln(1+\delta) + \frac{\delta^2}{3} \le 0.$$

Computing the derivatives of $f$, we have:

$$f'(\delta) = 1 - \ln(1+\delta) - 1 + \frac{2}{3}\delta = -\ln(1+\delta) + \frac{2}{3}\delta,$$

$$f''(\delta) = -\frac{1}{1+\delta} + \frac{2}{3}.$$

We see that $f''(\delta) < 0$ for $0 \le \delta < 1/2$ and $f''(\delta) > 0$ for $\delta > 1/2$. Hence $f'(\delta)$ first decreases and then increases over the interval $[0,1]$. Since $f'(0) = 0$ and $f'(1) < 0$, we can conclude that $f'(\delta) \le 0$ in the interval $[0,1]$. Since $f(0) = 0$, it follows that $f(\delta) \le 0$ in that interval, proving (4.4.2).

To prove (4.4.3), let $R = (1+\delta)\mu$. Then, for $R \ge 6\mu$, $\delta = R/\mu - 1 \ge 5$. Hence, using (4.4.1),

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \le \left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu} \le \left(\frac{e}{6}\right)^{R} \le 2^{-R}.$$

We obtain similar results bounding the deviation below the mean.

Theorem 4.5: Let $X_1,\dots,X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathrm{E}[X]$. Then for $0 < \delta < 1$:

$$\Pr(X \le (1-\delta)\mu) \le \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu};$$

$$\Pr(X \le (1-\delta)\mu) \le e^{-\mu\delta^2/2}.$$

Again, the first bound is stronger, but the latter is generally easier to use and sufficient in most applications.

Often the following form of the Chernoff bound is used.

Corollary 4.6: Let $X_1,\dots,X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathrm{E}[X]$. For $0 < \delta < 1$:

$$\Pr(|X - \mu| \ge \delta\mu) \le 2e^{-\mu\delta^2/3}.$$

In practice we often do not have the exact value of $\mathrm{E}[X]$. Instead we can use $\mu \ge \mathrm{E}[X]$ in Theorem 4.4 and $\mu \le \mathrm{E}[X]$ in Theorem 4.5.
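To see concretely that (4.4.1) dominates (4.4.2) on the range $0 < \delta \le 1$ where both apply, the following small Python check (an added illustration, not part of the original notes; the function names are ours) evaluates both bounds for a few values of $\mu$ and $\delta$.

```python
# Numeric comparison of the upper-tail bounds (4.4.1) and (4.4.2) on the
# range 0 < delta <= 1 where both apply. Added sanity check, not from the
# original notes.
import math

def bound_441(mu: float, delta: float) -> float:
    """(4.4.1): (e^delta / (1+delta)^(1+delta))^mu, valid for any delta > 0."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def bound_442(mu: float, delta: float) -> float:
    """(4.4.2): exp(-mu * delta^2 / 3), valid for 0 < delta <= 1."""
    return math.exp(-mu * delta ** 2 / 3)

for mu in (10.0, 100.0):
    for delta in (0.1, 0.5, 1.0):
        b1, b2 = bound_441(mu, delta), bound_442(mu, delta)
        assert b1 <= b2, "(4.4.1) should never be weaker on this range"
        print(f"mu={mu:5.0f} delta={delta:.1f}  (4.4.1)={b1:.3e}  (4.4.2)={b2:.3e}")
```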

4.2.2. Example: Coin Flips

Let $X$ be the number of heads in a sequence of $n$ independent fair coin flips, so that $\mu = \mathrm{E}[X] = n/2$. Applying the Chernoff bound of Corollary 4.6 with $\delta = \sqrt{6\ln n/n}$, we have

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{1}{2}\sqrt{6n\ln n}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{6\ln n}{n}\right) = \frac{2}{n}.$$

Thus, the concentration around the mean $n/2$ is very tight; most of the time, the deviations from the mean are of the order of $O(\sqrt{n\ln n})$.

Consider now the probability of at least $3n/4$ or at most $n/4$ heads in a sequence of $n$ independent fair coin flips. Chebyshev's inequality showed that

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le \frac{4}{n}.$$

Already, this is worse than the Chernoff bound just calculated for a significantly larger event! Using the Chernoff bound with $\delta = 1/2$, we find that

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{1}{4}\right) = 2e^{-n/24}.$$

Thus, Chernoff's technique gives a bound that is exponentially smaller than that obtained using Chebyshev's inequality.
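As a sanity check on these bounds, the following simulation sketch (our addition, with illustrative parameter choices) estimates $\Pr(|X - n/2| \ge n/4)$ empirically and prints it next to the Chebyshev bound $4/n$ and the Chernoff bound $2e^{-n/24}$. For small $n$ the Chebyshev bound can even be the smaller of the two; the Chernoff bound wins decisively as $n$ grows.

```python
# Empirical check (added illustration): estimate Pr(|X - n/2| >= n/4) for
# n fair coin flips and compare with the Chebyshev (4/n) and Chernoff
# (2 * e^{-n/24}) bounds.
import math
import random

def tail_estimate(n: int, trials: int = 20_000) -> float:
    """Fraction of trials in which |#heads - n/2| >= n/4."""
    hits = 0
    for _ in range(trials):
        heads = sum(random.getrandbits(1) for _ in range(n))
        if abs(heads - n / 2) >= n / 4:
            hits += 1
    return hits / trials

for n in (50, 100, 200):
    print(f"n={n:3d}  empirical={tail_estimate(n):.5f}  "
          f"Chebyshev={4 / n:.5f}  Chernoff={2 * math.exp(-n / 24):.5f}")
```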

4.2.3. Application: Estimating a Parameter

Suppose we want to evaluate the probability that a particular gene mutation occurs in the population. An expensive lab test determines if a DNA sample carries the mutation, and we would like to obtain a relatively reliable estimate from a small number of samples.

Let $p$ be the unknown value that we are trying to estimate. Assume that we have $n$ samples and that $X = \tilde{p}n$ of these samples have the mutation.

Given a sufficiently large number of samples, we expect $p$ to be close to the sampled value $\tilde{p}$.

Definition 4.2: A $1-\gamma$ confidence interval for a parameter $p$ is an interval $[\tilde{p}-\delta, \tilde{p}+\delta]$ s.t.

$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) \ge 1-\gamma.$$

Instead of predicting a single value for the parameter, we give an interval that is likely to contain the parameter. If $p$ can take on any real value, it may not make sense to try to pin down its exact value from a finite sample, but it does make sense to estimate it within some small range.

We want both the interval size $2\delta$ and the error probability $\gamma$ to be as small as possible. We derive a trade-off between these two parameters and the number of samples $n$.

In particular, given that among $n$ samples (chosen uniformly at random from the entire population) we find the mutation in exactly $X = \tilde{p}n$ samples, we need to find values of $\delta$ and $\gamma$ for which

$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) = 1 - \Pr(p < \tilde{p}-\delta) - \Pr(p > \tilde{p}+\delta) \ge 1-\gamma.$$

Now $X \sim B(n,p)$, so $\mathrm{E}[X] = np$. If $p \notin [\tilde{p}-\delta, \tilde{p}+\delta]$, then we have one of the following two events:

1. if $p < \tilde{p}-\delta$, then $X = \tilde{p}n > (p+\delta)n = \mathrm{E}[X](1+\delta/p)$;
2. if $p > \tilde{p}+\delta$, then $X = \tilde{p}n < (p-\delta)n = \mathrm{E}[X](1-\delta/p)$.

We can apply the Chernoff bounds of Theorems 4.4 and 4.5 to compute

$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) = \Pr\left(X < \mathrm{E}[X]\left(1-\frac{\delta}{p}\right)\right) + \Pr\left(X > \mathrm{E}[X]\left(1+\frac{\delta}{p}\right)\right) < e^{-n\delta^2/2p} + e^{-n\delta^2/3p}.$$

This bound is not useful because the value of $p$ is unknown. A simple solution is to use the fact that $p \le 1$, yielding

$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) < e^{-n\delta^2/2} + e^{-n\delta^2/3}.$$

Setting $\gamma = e^{-n\delta^2/2} + e^{-n\delta^2/3}$, we obtain a trade-off between $\delta$, $n$, and the error probability $\gamma$.
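To make the trade-off concrete, here is a small Python sketch (our illustration; the helper names `gamma_bound` and `samples_needed` are hypothetical, not from the notes) that evaluates the bound $\gamma = e^{-n\delta^2/2} + e^{-n\delta^2/3}$ and searches for the smallest $n$ achieving a target error probability.

```python
# Added illustration of the (n, delta, gamma) trade-off; not from the notes.
import math

def gamma_bound(n: int, delta: float) -> float:
    """Upper bound on Pr(p outside [p~ - delta, p~ + delta])."""
    return math.exp(-n * delta**2 / 2) + math.exp(-n * delta**2 / 3)

def samples_needed(delta: float, gamma: float) -> int:
    """Smallest n for which the bound drops below gamma (doubling + bisection)."""
    n = 1
    while gamma_bound(n, delta) > gamma:
        n *= 2
    lo, hi = n // 2, n
    while lo < hi:
        mid = (lo + hi) // 2
        if gamma_bound(mid, delta) <= gamma:
            hi = mid
        else:
            lo = mid + 1
    return hi

print(gamma_bound(1000, 0.05))      # error-probability bound for n=1000, delta=0.05
print(samples_needed(0.05, 0.05))   # n needed for a 95% confidence interval
```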

4.3. Better Bounds for Some Special Cases

We can obtain stronger bounds using a simpler proof technique for some special cases of symmetric random variables.

Theorem 4.7: Let $X_1,\dots,X_n$ be independent random variables with

$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$

Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,

$$\Pr(X \ge a) \le e^{-a^2/2n}.$$

Proof: For any $t > 0$,

$$\mathrm{E}[e^{tX_i}] = \frac{1}{2}e^{t} + \frac{1}{2}e^{-t}.$$

To estimate $\mathrm{E}[e^{tX_i}]$, we observe that, using the Taylor series expansion $e^t = \sum_{j \ge 0} t^j/j!$, the odd-order terms cancel and

$$\mathrm{E}[e^{tX_i}] = \sum_{j \ge 0} \frac{t^{2j}}{(2j)!} \le \sum_{j \ge 0} \frac{(t^2/2)^j}{j!} = e^{t^2/2},$$

since $(2j)! \ge 2^j j!$. Using this estimate yields

$$\mathrm{E}[e^{tX}] = \prod_{i=1}^{n} \mathrm{E}[e^{tX_i}] \le e^{t^2 n/2}$$

and

$$\Pr(X \ge a) = \Pr(e^{tX} \ge e^{ta}) \le \frac{\mathrm{E}[e^{tX}]}{e^{ta}} \le e^{t^2 n/2 - ta}.$$

Setting $t = a/n$, we obtain $\Pr(X \ge a) \le e^{-a^2/2n}$. By symmetry we also have $\Pr(X \le -a) \le e^{-a^2/2n}$.

Corollary 4.8: Let $X_1,\dots,X_n$ be independent random variables with

$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$

Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,

$$\Pr(|X| \ge a) \le 2e^{-a^2/2n}.$$

Applying the transformation $Y_i = (X_i + 1)/2$ to Corollary 4.8 proves:

Corollary 4.9: Let $Y_1,\dots,Y_n$ be independent random variables with

$$\Pr(Y_i = 1) = \Pr(Y_i = 0) = \frac{1}{2}.$$

Let $Y = \sum_{i=1}^{n} Y_i$ and $\mu = \mathrm{E}[Y] = n/2$.

1. For any $a > 0$, $\Pr(Y \ge \mu + a) \le e^{-2a^2/n}$.
2. For any $\delta > 0$, $\Pr(Y \ge (1+\delta)\mu) \le e^{-\delta^2\mu}$.
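The following quick simulation (an added illustration, not part of the notes) compares the empirical tail $\Pr(|X| \ge a)$ for a sum of $n$ independent $\pm 1$ variables against the $2e^{-a^2/2n}$ bound of Corollary 4.8.

```python
# Added empirical check of Corollary 4.8 for sums of independent +/-1 RVs.
import math
import random

def pm1_tail(n: int, a: float, trials: int = 50_000) -> float:
    """Empirical Pr(|X| >= a) where X is a sum of n random +/-1 values."""
    hits = 0
    for _ in range(trials):
        x = sum(random.choice((-1, 1)) for _ in range(n))
        if abs(x) >= a:
            hits += 1
    return hits / trials

n = 100
for a in (10, 20, 30):
    print(f"a={a:2d}  empirical={pm1_tail(n, a):.4f}  "
          f"bound={2 * math.exp(-a * a / (2 * n)):.4f}")
```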

4.4. Application: Set Balancing

Given an $n \times m$ matrix $A$ with entries in $\{0,1\}$, let $A\bar{b} = \bar{c}$. We are looking for a vector $\bar{b}$ with entries in $\{-1,1\}$ that minimizes

$$\|A\bar{b}\|_\infty = \max_{i=1,\dots,n} |c_i|.$$

This problem arises in designing statistical experiments. Each column of the matrix $A$ represents a subject in the experiment and each row represents a feature. The vector $\bar{b}$ partitions the subjects into two disjoint groups, so that each feature is roughly as balanced as possible between the two groups. One of the groups serves as a control group for an experiment that is run on the other group.

We randomly choose the entries of $\bar{b}$, with $\Pr(b_i = 1) = \Pr(b_i = -1) = 1/2$; the choices for different entries are independent. Surprisingly, although this algorithm ignores the entries of the matrix $A$, $\|A\bar{b}\|_\infty$ is likely to be only $O(\sqrt{m\ln n})$. This bound is fairly tight: when $m = n$, there exists a matrix for which $\|A\bar{b}\|_\infty$ is $\Omega(\sqrt{n})$ for any choice of $\bar{b}$.

Theorem 4.11: For a random vector $\bar{b}$ with entries chosen independently and with equal probability from the set $\{-1,1\}$,

$$\Pr\left(\|A\bar{b}\|_\infty \ge \sqrt{4m\ln n}\right) \le \frac{2}{n}.$$

Proof: Consider the $i$th row $\bar{a}_i = a_{i,1},\dots,a_{i,m}$, and let $k$ be the number of 1s in that row. If $k \le \sqrt{4m\ln n}$, then clearly

$$|Z_i| = \left|\sum_{j=1}^{m} a_{i,j}b_j\right| \le \sqrt{4m\ln n}.$$

On the other hand, if $k > \sqrt{4m\ln n}$, then we note that the $k$ nonzero terms in the sum

$$Z_i = \sum_{j=1}^{m} a_{i,j}b_j$$

are independent random variables, each with probability 1/2 of being either $+1$ or $-1$. Now using the Chernoff bound of Corollary 4.8 and the fact that $k \le m$,

$$\Pr\left(|Z_i| > \sqrt{4m\ln n}\right) \le 2e^{-4m\ln n/2k} \le 2e^{-2\ln n} = \frac{2}{n^2}.$$

By the union bound, the probability that the bound fails for any row is at most $2/n$.
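A short simulation sketch (our addition, not from the notes) of the randomized set-balancing algorithm: draw a random $0/1$ matrix $A$ and a random $\pm 1$ vector $\bar{b}$, then compare the observed discrepancy $\|A\bar{b}\|_\infty$ with the $\sqrt{4m\ln n}$ guarantee of Theorem 4.11.

```python
# Added illustration: random set balancing versus the Theorem 4.11 bound.
import math
import random

def random_set_balance(n: int, m: int):
    """Return (||Ab||_inf, sqrt(4 m ln n)) for random A in {0,1}^{n x m}
    and random b in {-1,1}^m."""
    A = [[random.randint(0, 1) for _ in range(m)] for _ in range(n)]
    b = [random.choice((-1, 1)) for _ in range(m)]
    discrepancy = max(abs(sum(a_ij * b_j for a_ij, b_j in zip(row, b)))
                      for row in A)
    return discrepancy, math.sqrt(4 * m * math.log(n))

for _ in range(5):
    disc, bound = random_set_balance(n=200, m=200)
    print(f"||Ab||_inf = {disc:3d}   sqrt(4 m ln n) = {bound:.1f}")
```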

5. Balls, Bins, and Random Graphs

Let us throw $m$ balls randomly into $n$ bins, each ball landing in a bin chosen independently and uniformly at random (I+U@R). We use the techniques we have developed previously to analyze this process and develop a new approach based on what is known as the Poisson approximation.

5.1. Example: The Birthday Paradox

Is it more likely that some two people in the room share the same birthday or that no two people in the room share the same birthday? We assume that the birthday of each person is a random day from a 365-day year, each chosen I+U@R. That is, a person's birthday is equally likely to be any day of the year; we avoid leap years and we ignore the possibility of twins.

Let there be 30 people. Thirty days must be chosen from the 365; there are $\binom{365}{30}$ ways to do this. These 30 days can be assigned to the people in any of the $30!$ possible orders. Hence there are $\binom{365}{30}\cdot 30!$ configurations where no two people share the same birthday, out of the $365^{30}$ ways the birthdays could occur. Thus, the probability is

$$\frac{\binom{365}{30}\cdot 30!}{365^{30}}.$$

We can also consider one person at a time. The first person has some birthday. The probability that the second person has a different birthday is $(1 - 1/365)$. The probability that the third person then has a birthday different from the first two, given that the first two have different birthdays, is $(1 - 2/365)$. Continuing on, the probability that the $k$th person has a different birthday than the first $k-1$, assuming that they all have different birthdays, is $(1 - (k-1)/365)$.

So the probability that 30 people all have different birthdays is the product of these terms:

$$\left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{29}{365}\right).$$

This product is approximately 0.2937, so when 30 people are in the room there is more than a 70% chance that two share the same birthday. A similar calculation shows that only 23 people need to be in the room before it is more likely than not that two people share a birthday.

More generally, if there are $m$ people and $n$ possible birthdays then the probability that all $m$ have different birthdays is

$$\prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right).$$

Using that $1 - k/n \approx e^{-k/n}$ when $k$ is small compared to $n$, we see that if $m$ is small compared to $n$ then

$$\prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right) \approx \prod_{j=1}^{m-1} e^{-j/n} = \exp\left(-\sum_{j=1}^{m-1}\frac{j}{n}\right) = \exp\left(-\frac{m(m-1)}{2n}\right) \approx \exp\left(-\frac{m^2}{2n}\right).$$

Hence the value of $m$ at which the probability that $m$ people all have different birthdays is $1/2$ is approximately given by the equation $m^2/2n = \ln 2$, or

$$m = \sqrt{2n\ln 2}.$$

For $n = 365$, this approximation gives $m = 22.49$, matching the exact calculation quite well. Mars has $n = 687$ days, so we would need $m = 30.86$ aliens.
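These figures are easy to reproduce; the following snippet (added for illustration, not part of the notes) computes the exact product, the $e^{-m^2/2n}$ approximation, and $\sqrt{2n\ln 2}$ for Earth and Mars.

```python
# Added numeric check of the birthday-paradox calculations.
import math

def all_distinct(m: int, n: int) -> float:
    """Exact probability that m birthdays over n days are all distinct."""
    p = 1.0
    for j in range(1, m):
        p *= 1 - j / n
    return p

print(all_distinct(30, 365))             # ~0.2937
print(all_distinct(23, 365))             # ~0.4927, first value below 1/2
print(math.exp(-30**2 / (2 * 365)))      # ~0.2916, the approximation
print(math.sqrt(2 * 365 * math.log(2)))  # ~22.49
print(math.sqrt(2 * 687 * math.log(2)))  # ~30.86
```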

The following simple arguments give loose bounds and good intuition. Let us consider each person one at a time, and let $E_k$ be the event that the $k$th person's birthday does not match any of the birthdays of the first $k-1$ people. Then the probability that the first $k$ people fail to have distinct birthdays is

$$\Pr(\bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_k) \le \sum_{i=1}^{k}\Pr(\bar{E}_i) \le \sum_{i=1}^{k}\frac{i-1}{n} \le \frac{k^2}{2n}.$$

If $k \le \sqrt{n}$, this probability is less than $1/2$, so with $\lceil\sqrt{n}\rceil$ people the probability is at least $1/2$ that all birthdays will be distinct.

Now assume that the first $\lceil\sqrt{n}\rceil$ people all have distinct birthdays. Each person after that has probability at least $\lceil\sqrt{n}\rceil/n \ge 1/\sqrt{n}$ of having the same birthday as one of these first $\lceil\sqrt{n}\rceil$ people. Hence the probability that the next $\lceil\sqrt{n}\rceil$ people all have different birthdays than the first $\lceil\sqrt{n}\rceil$ is at most

$$\left(1 - \frac{1}{\sqrt{n}}\right)^{\lceil\sqrt{n}\rceil} < \frac{1}{e} < \frac{1}{2}.$$

Hence, once there are $2\lceil\sqrt{n}\rceil$ people, the probability is at most $1/e < 1/2$ that all birthdays will be distinct.

5.2. Balls into Bins

$m$ balls are thrown into $n$ bins, with the location of each ball chosen I+U@R from the $n$ possibilities. The question behind the birthday paradox is whether or not there is a bin with two balls. How many of the bins are empty? How many balls are in the fullest bin? Many of these questions have applications to the design and analysis of algorithms.

Birthday paradox, restated: place $m$ balls randomly into $n$ bins; then, already for $m$ proportional to $\sqrt{n}$, at least one of the bins is likely to have more than one ball in it.

Another interesting question concerns the maximum number of balls in a bin, or the maximum load. Let us consider the case where $m = n$, so that the number of balls equals the number of bins and the average load is 1. Of course the maximum load is at most $n$, but it is very unlikely that all $n$ balls land in the same bin. We seek an upper bound that holds with probability tending to 1 as $n$ grows large.

We can show that the maximum load is more than $3\ln n/\ln\ln n$ with probability at most $1/n$ for sufficiently large $n$, via a direct calculation and a union bound. This is a very loose bound; although the maximum load is in fact $\Theta(\ln n/\ln\ln n)$ with probability close to 1, the constant factor 3 is chosen to simplify the argument and could be reduced with more care.

Lemma 5.1: When $n$ balls are thrown I+U@R into $n$ bins, the probability that the maximum load is more than $3\ln n/\ln\ln n$ is at most $1/n$ for sufficiently large $n$.

Proof: The probability that bin 1 receives at least $M$ balls is at most

$$\binom{n}{M}\left(\frac{1}{n}\right)^{M}.$$

This follows from a union bound; there are $\binom{n}{M}$ distinct sets of $M$ balls, and for any set of $M$ balls the probability that all land in bin 1 is $(1/n)^M$. We now use the inequalities

$$\binom{n}{M}\left(\frac{1}{n}\right)^{M} \le \frac{1}{M!} \le \left(\frac{e}{M}\right)^{M}.$$

The second inequality is a consequence of the following general bound on factorials: since

$$\frac{k^k}{k!} < \sum_{i \ge 0}\frac{k^i}{i!} = e^k,$$

we have

$$k! > \left(\frac{k}{e}\right)^{k}.$$

Applying a union bound again allows us to find that, for $M = 3\ln n/\ln\ln n$, the probability that any bin receives at least $M$ balls is bounded above by

$$n\left(\frac{e}{M}\right)^{M} \le n\left(\frac{e\ln\ln n}{3\ln n}\right)^{3\ln n/\ln\ln n} \le n\left(\frac{\ln\ln n}{\ln n}\right)^{3\ln n/\ln\ln n} = e^{\ln n}\left(e^{\ln\ln\ln n - \ln\ln n}\right)^{3\ln n/\ln\ln n} = e^{-2\ln n + 3(\ln n)(\ln\ln\ln n)/\ln\ln n} \le \frac{1}{n}$$

for sufficiently large $n$.
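To see how tight Lemma 5.1 is in practice, this simulation sketch (our addition, not from the notes) throws $n$ balls into $n$ bins I+U@R and compares the observed maximum load with $3\ln n/\ln\ln n$.

```python
# Added illustration: empirical maximum load versus the Lemma 5.1 bound.
import math
import random
from collections import Counter

def max_load(n: int) -> int:
    """Maximum bin load after throwing n balls into n bins uniformly at random."""
    counts = Counter(random.randrange(n) for _ in range(n))
    return max(counts.values())

for n in (10**3, 10**4, 10**5):
    bound = 3 * math.log(n) / math.log(math.log(n))
    loads = [max_load(n) for _ in range(20)]
    print(f"n={n:6d}  max load over 20 runs: {max(loads)}  bound: {bound:.1f}")
```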