The first bound is the strongest; the other two bounds are often easier to state and compute. Recall the setting of Theorem 4.4: $X_1,\dots,X_n$ are independent Poisson trials with $\Pr(X_i = 1) = p_i$, $X = \sum_{i=1}^{n} X_i$, and $\mu = \mathrm{E}[X]$. The three bounds are

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \quad \text{for any } \delta > 0, \qquad (4.4.1)$$

$$\Pr(X \ge (1+\delta)\mu) \le e^{-\mu\delta^2/3} \quad \text{for } 0 < \delta \le 1, \qquad (4.4.2)$$

$$\Pr(X \ge R) \le 2^{-R} \quad \text{for } R \ge 6\mu. \qquad (4.4.3)$$

Proof: Applying Markov's inequality, for any $t > 0$ we have

$$\Pr(X \ge (1+\delta)\mu) = \Pr\!\left(e^{tX} \ge e^{t(1+\delta)\mu}\right) \le \frac{\mathrm{E}[e^{tX}]}{e^{t(1+\delta)\mu}} \le \frac{e^{(e^t-1)\mu}}{e^{t(1+\delta)\mu}},$$

using the bound $\mathrm{E}[e^{tX}] \le e^{(e^t-1)\mu}$ on the moment generating function established earlier. For any $\delta > 0$, we can set $t = \ln(1+\delta) > 0$ to get (4.4.1):

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}.$$

For (4.4.2) we need to show that, for $0 < \delta \le 1$,

$$\frac{e^{\delta}}{(1+\delta)^{1+\delta}} \le e^{-\delta^2/3}.$$

Taking the logarithm of both sides, we obtain the equivalent condition

$$f(\delta) = \delta - (1+\delta)\ln(1+\delta) + \frac{\delta^2}{3} \le 0.$$

Computing the derivatives of $f$, we have:

$$f'(\delta) = 1 - \ln(1+\delta) - 1 + \frac{2}{3}\delta = -\ln(1+\delta) + \frac{2}{3}\delta,$$

$$f''(\delta) = -\frac{1}{1+\delta} + \frac{2}{3}.$$

We see that $f''(\delta) < 0$ for $0 \le \delta < 1/2$ and $f''(\delta) > 0$ for $\delta > 1/2$. Hence $f'(\delta)$ first decreases and then increases over the interval $[0,1]$. Since $f'(0) = 0$ and $f'(1) < 0$, we can conclude that $f'(\delta) \le 0$ in the interval $[0,1]$. Since $f(0) = 0$, it follows that $f(\delta) \le 0$ in that interval, proving (4.4.2).

To prove (4.4.3), let $R = (1+\delta)\mu$. Then, for $R \ge 6\mu$, $\delta = R/\mu - 1 \ge 5$. Hence, using (4.4.1),

$$\Pr(X \ge (1+\delta)\mu) \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \le \left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu} \le \left(\frac{e}{6}\right)^{R} \le 2^{-R}.$$

We obtain similar results bounding the deviation below the mean.

Theorem 4.5: Let $X_1,\dots,X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathrm{E}[X]$. Then for $0 < \delta < 1$:

$$\Pr(X \le (1-\delta)\mu) \le \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu};$$

$$\Pr(X \le (1-\delta)\mu) \le e^{-\mu\delta^2/2}.$$

Again, the first bound is stronger, but the latter is generally easier to use and sufficient in most applications.

Often the following form of the Chernoff bound is used.

Corollary 4.6: Let $X_1,\dots,X_n$ be independent Poisson trials s.t. $\Pr(X_i = 1) = p_i$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathrm{E}[X]$. For $0 < \delta < 1$:

$$\Pr(|X - \mu| \ge \delta\mu) \le 2e^{-\mu\delta^2/3}.$$

In practice we often do not have the exact value of $\mathrm{E}[X]$. Instead we can use $\mu \ge \mathrm{E}[X]$ in Theorem 4.4 and $\mu \le \mathrm{E}[X]$ in Theorem 4.5.
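To see concretely that (4.4.1) dominates (4.4.2) on the range $0 < \delta \le 1$ where both apply, the following small Python check (an added illustration, not part of the original notes; the function names are ours) evaluates both bounds for a few values of $\mu$ and $\delta$.

```python
# Numeric comparison of the upper-tail bounds (4.4.1) and (4.4.2) on the
# range 0 < delta <= 1 where both apply. Added sanity check, not from the
# original notes.
import math

def bound_441(mu: float, delta: float) -> float:
    """(4.4.1): (e^delta / (1+delta)^(1+delta))^mu, valid for any delta > 0."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def bound_442(mu: float, delta: float) -> float:
    """(4.4.2): exp(-mu * delta^2 / 3), valid for 0 < delta <= 1."""
    return math.exp(-mu * delta ** 2 / 3)

for mu in (10.0, 100.0):
    for delta in (0.1, 0.5, 1.0):
        b1, b2 = bound_441(mu, delta), bound_442(mu, delta)
        assert b1 <= b2, "(4.4.1) should never be weaker on this range"
        print(f"mu={mu:5.0f} delta={delta:.1f}  (4.4.1)={b1:.3e}  (4.4.2)={b2:.3e}")
```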

4.2.2. Example: Coin Flips

Let $X$ be the number of heads in a sequence of $n$ independent fair coin flips, so that $\mu = \mathrm{E}[X] = n/2$. Applying the Chernoff bound of Corollary 4.6 with $\delta = \sqrt{6\ln n/n}$, we have

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{1}{2}\sqrt{6n\ln n}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{6\ln n}{n}\right) = \frac{2}{n}.$$

Thus, the concentration around the mean $n/2$ is very tight; most of the time, the deviations from the mean are of the order of $O(\sqrt{n\ln n})$.

Consider now the probability of at least $3n/4$ or at most $n/4$ heads in a sequence of $n$ independent fair coin flips. Chebyshev's inequality showed that

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le \frac{4}{n}.$$

Already, this is worse than the Chernoff bound just calculated for a significantly larger event! Using the Chernoff bound with $\delta = 1/2$, we find that

$$\Pr\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{4}\right) \le 2\exp\left(-\frac{1}{3}\cdot\frac{n}{2}\cdot\frac{1}{4}\right) = 2e^{-n/24}.$$

Thus, Chernoff's technique gives a bound that is exponentially smaller than that obtained using Chebyshev's inequality.
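As a sanity check on these bounds, the following simulation sketch (our addition, with illustrative parameter choices) estimates $\Pr(|X - n/2| \ge n/4)$ empirically and prints it next to the Chebyshev bound $4/n$ and the Chernoff bound $2e^{-n/24}$. For small $n$ the Chebyshev bound can even be the smaller of the two; the Chernoff bound wins decisively as $n$ grows.

```python
# Empirical check (added illustration): estimate Pr(|X - n/2| >= n/4) for
# n fair coin flips and compare with the Chebyshev (4/n) and Chernoff
# (2 * e^{-n/24}) bounds.
import math
import random

def tail_estimate(n: int, trials: int = 20_000) -> float:
    """Fraction of trials in which |#heads - n/2| >= n/4."""
    hits = 0
    for _ in range(trials):
        heads = sum(random.getrandbits(1) for _ in range(n))
        if abs(heads - n / 2) >= n / 4:
            hits += 1
    return hits / trials

for n in (50, 100, 200):
    print(f"n={n:3d}  empirical={tail_estimate(n):.5f}  "
          f"Chebyshev={4 / n:.5f}  Chernoff={2 * math.exp(-n / 24):.5f}")
```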

4.2.3. Application: Estimating a Parameter

Suppose we want to evaluate the probability that a particular gene mutation occurs in the population. An expensive lab test determines if a DNA sample carries the mutation, and we would like to obtain a relatively reliable estimate from a small number of samples.

Let $p$ be the unknown value that we are trying to estimate. Assume that we have $n$ samples and that $X = \tilde{p}n$ of these samples have the mutation.

Given a sufficiently large number of samples, we expect $p$ to be close to the sampled value $\tilde{p}$.

Definition 4.2: A $1-\gamma$ confidence interval for a parameter $p$ is an interval $[\tilde{p}-\delta, \tilde{p}+\delta]$ s.t.

$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) \ge 1-\gamma.$$

Instead of predicting a single value for the parameter, we give an interval that is likely to contain the parameter. If $p$ can take on any real value, it may not make sense to try to pin down its exact value from a finite sample, but it does make sense to estimate it within some small range.

We want both the interval size $2\delta$ and the error probability $\gamma$ to be as small as possible. We derive a trade-off between these two parameters and the number of samples $n$.

In particular, given that among $n$ samples (chosen uniformly at random from the entire population) we find the mutation in exactly $X = \tilde{p}n$ samples, we need to find values of $\delta$ and $\gamma$ for which

$$\Pr(p \in [\tilde{p}-\delta, \tilde{p}+\delta]) = 1 - \Pr(p < \tilde{p}-\delta) - \Pr(p > \tilde{p}+\delta) \ge 1-\gamma.$$

Now $X \sim B(n,p)$, so $\mathrm{E}[X] = np$. If $p \notin [\tilde{p}-\delta, \tilde{p}+\delta]$, then we have one of the following two events:

1. if $p < \tilde{p}-\delta$, then $X = \tilde{p}n > (p+\delta)n = \mathrm{E}[X](1+\delta/p)$;
2. if $p > \tilde{p}+\delta$, then $X = \tilde{p}n < (p-\delta)n = \mathrm{E}[X](1-\delta/p)$.

We can apply the Chernoff bounds of Theorems 4.4 and 4.5 to compute

$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) = \Pr\left(X < \mathrm{E}[X]\left(1-\frac{\delta}{p}\right)\right) + \Pr\left(X > \mathrm{E}[X]\left(1+\frac{\delta}{p}\right)\right) < e^{-n\delta^2/2p} + e^{-n\delta^2/3p}.$$

This bound is not useful because the value of $p$ is unknown. A simple solution is to use the fact that $p \le 1$, yielding

$$\Pr(p \notin [\tilde{p}-\delta, \tilde{p}+\delta]) < e^{-n\delta^2/2} + e^{-n\delta^2/3}.$$

Setting $\gamma = e^{-n\delta^2/2} + e^{-n\delta^2/3}$, we obtain a trade-off between $\delta$, $n$, and the error probability $\gamma$.
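To make the trade-off concrete, here is a small Python sketch (our illustration; the helper names `gamma_bound` and `samples_needed` are hypothetical, not from the notes) that evaluates the bound $\gamma = e^{-n\delta^2/2} + e^{-n\delta^2/3}$ and searches for the smallest $n$ achieving a target error probability.

```python
# Added illustration of the (n, delta, gamma) trade-off; not from the notes.
import math

def gamma_bound(n: int, delta: float) -> float:
    """Upper bound on Pr(p outside [p~ - delta, p~ + delta])."""
    return math.exp(-n * delta**2 / 2) + math.exp(-n * delta**2 / 3)

def samples_needed(delta: float, gamma: float) -> int:
    """Smallest n for which the bound drops below gamma (doubling + bisection)."""
    n = 1
    while gamma_bound(n, delta) > gamma:
        n *= 2
    lo, hi = n // 2, n
    while lo < hi:
        mid = (lo + hi) // 2
        if gamma_bound(mid, delta) <= gamma:
            hi = mid
        else:
            lo = mid + 1
    return hi

print(gamma_bound(1000, 0.05))      # error-probability bound for n=1000, delta=0.05
print(samples_needed(0.05, 0.05))   # n needed for a 95% confidence interval
```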

4.3. Better Bounds for Some Special Cases

We can obtain stronger bounds using a simpler proof technique for some special cases of symmetric random variables.

Theorem 4.7: Let $X_1,\dots,X_n$ be independent random variables with

$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$

Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,

$$\Pr(X \ge a) \le e^{-a^2/2n}.$$

Proof: For any $t > 0$,

$$\mathrm{E}[e^{tX_i}] = \frac{1}{2}e^{t} + \frac{1}{2}e^{-t}.$$

To estimate $\mathrm{E}[e^{tX_i}]$, we observe that, using the Taylor series expansion $e^t = \sum_{j \ge 0} t^j/j!$, the odd-order terms cancel and

$$\mathrm{E}[e^{tX_i}] = \sum_{j \ge 0} \frac{t^{2j}}{(2j)!} \le \sum_{j \ge 0} \frac{(t^2/2)^j}{j!} = e^{t^2/2},$$

since $(2j)! \ge 2^j j!$. Using this estimate yields

$$\mathrm{E}[e^{tX}] = \prod_{i=1}^{n} \mathrm{E}[e^{tX_i}] \le e^{t^2 n/2}$$

and

$$\Pr(X \ge a) = \Pr(e^{tX} \ge e^{ta}) \le \frac{\mathrm{E}[e^{tX}]}{e^{ta}} \le e^{t^2 n/2 - ta}.$$

Setting $t = a/n$, we obtain $\Pr(X \ge a) \le e^{-a^2/2n}$. By symmetry we also have $\Pr(X \le -a) \le e^{-a^2/2n}$.

Corollary 4.8: Let $X_1,\dots,X_n$ be independent random variables with

$$\Pr(X_i = 1) = \Pr(X_i = -1) = \frac{1}{2}.$$

Let $X = \sum_{i=1}^{n} X_i$. For any $a > 0$,

$$\Pr(|X| \ge a) \le 2e^{-a^2/2n}.$$

Applying the transformation $Y_i = (X_i + 1)/2$ to Corollary 4.8 proves:

Corollary 4.9: Let $Y_1,\dots,Y_n$ be independent random variables with

$$\Pr(Y_i = 1) = \Pr(Y_i = 0) = \frac{1}{2}.$$

Let $Y = \sum_{i=1}^{n} Y_i$ and $\mu = \mathrm{E}[Y] = n/2$.

1. For any $a > 0$, $\Pr(Y \ge \mu + a) \le e^{-2a^2/n}$.
2. For any $\delta > 0$, $\Pr(Y \ge (1+\delta)\mu) \le e^{-\delta^2\mu}$.
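The following quick simulation (an added illustration, not part of the notes) compares the empirical tail $\Pr(|X| \ge a)$ for a sum of $n$ independent $\pm 1$ variables against the $2e^{-a^2/2n}$ bound of Corollary 4.8.

```python
# Added empirical check of Corollary 4.8 for sums of independent +/-1 RVs.
import math
import random

def pm1_tail(n: int, a: float, trials: int = 50_000) -> float:
    """Empirical Pr(|X| >= a) where X is a sum of n random +/-1 values."""
    hits = 0
    for _ in range(trials):
        x = sum(random.choice((-1, 1)) for _ in range(n))
        if abs(x) >= a:
            hits += 1
    return hits / trials

n = 100
for a in (10, 20, 30):
    print(f"a={a:2d}  empirical={pm1_tail(n, a):.4f}  "
          f"bound={2 * math.exp(-a * a / (2 * n)):.4f}")
```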

4.4. Application: Set Balancing

Given an $n \times m$ matrix $A$ with entries in $\{0,1\}$, let $A\bar{b} = \bar{c}$. We are looking for a vector $\bar{b}$ with entries in $\{-1,1\}$ that minimizes

$$\|A\bar{b}\|_\infty = \max_{i=1,\dots,n} |c_i|.$$

This problem arises in designing statistical experiments. Each column of the matrix $A$ represents a subject in the experiment and each row represents a feature. The vector $\bar{b}$ partitions the subjects into two disjoint groups, so that each feature is roughly as balanced as possible between the two groups. One of the groups serves as a control group for an experiment that is run on the other group.

We randomly choose the entries of $\bar{b}$, with $\Pr(b_i = 1) = \Pr(b_i = -1) = 1/2$; the choices for different entries are independent. Surprisingly, although this algorithm ignores the entries of the matrix $A$, $\|A\bar{b}\|_\infty$ is likely to be only $O(\sqrt{m\ln n})$. This bound is fairly tight: when $m = n$, there exists a matrix for which $\|A\bar{b}\|_\infty$ is $\Omega(\sqrt{n})$ for any choice of $\bar{b}$.

Theorem 4.11: For a random vector $\bar{b}$ with entries chosen independently and with equal probability from the set $\{-1,1\}$,

$$\Pr\left(\|A\bar{b}\|_\infty \ge \sqrt{4m\ln n}\right) \le \frac{2}{n}.$$

Proof: Consider the $i$th row $\bar{a}_i = a_{i,1},\dots,a_{i,m}$, and let $k$ be the number of 1s in that row. If $k \le \sqrt{4m\ln n}$, then clearly

$$|Z_i| = \left|\sum_{j=1}^{m} a_{i,j}b_j\right| \le \sqrt{4m\ln n}.$$

On the other hand, if $k > \sqrt{4m\ln n}$, then we note that the $k$ nonzero terms in the sum

$$Z_i = \sum_{j=1}^{m} a_{i,j}b_j$$

are independent random variables, each with probability 1/2 of being either $+1$ or $-1$. Now using the Chernoff bound of Corollary 4.8 and the fact that $k \le m$,

$$\Pr\left(|Z_i| > \sqrt{4m\ln n}\right) \le 2e^{-4m\ln n/2k} \le 2e^{-2\ln n} = \frac{2}{n^2}.$$

By the union bound, the probability that the bound fails for any row is at most $2/n$.
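A short simulation sketch (our addition, not from the notes) of the randomized set-balancing algorithm: draw a random $0/1$ matrix $A$ and a random $\pm 1$ vector $\bar{b}$, then compare the observed discrepancy $\|A\bar{b}\|_\infty$ with the $\sqrt{4m\ln n}$ guarantee of Theorem 4.11.

```python
# Added illustration: random set balancing versus the Theorem 4.11 bound.
import math
import random

def random_set_balance(n: int, m: int):
    """Return (||Ab||_inf, sqrt(4 m ln n)) for random A in {0,1}^{n x m}
    and random b in {-1,1}^m."""
    A = [[random.randint(0, 1) for _ in range(m)] for _ in range(n)]
    b = [random.choice((-1, 1)) for _ in range(m)]
    discrepancy = max(abs(sum(a_ij * b_j for a_ij, b_j in zip(row, b)))
                      for row in A)
    return discrepancy, math.sqrt(4 * m * math.log(n))

for _ in range(5):
    disc, bound = random_set_balance(n=200, m=200)
    print(f"||Ab||_inf = {disc:3d}   sqrt(4 m ln n) = {bound:.1f}")
```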

5. Balls, Bins, and Random Graphs

Let us throw $m$ balls randomly into $n$ bins, each ball landing in a bin chosen independently and uniformly at random (I+U@R). We use the techniques we have developed previously to analyze this process and develop a new approach based on what is known as the Poisson approximation.

5.1. Example: The Birthday Paradox

Is it more likely that some two people in the room share the same birthday or that no two people in the room share the same birthday? We assume that the birthday of each person is a random day from a 365-day year, each chosen I+U@R. That is, a person's birthday is equally likely to be any day of the year; we avoid leap years and we ignore the possibility of twins.

Let there be 30 people. Thirty days must be chosen from the 365; there are $\binom{365}{30}$ ways to do this. These 30 days can be assigned to the people in any of the $30!$ possible orders. Hence there are $\binom{365}{30}\cdot 30!$ configurations where no two people share the same birthday, out of the $365^{30}$ ways the birthdays could occur. Thus, the probability is

$$\frac{\binom{365}{30}\cdot 30!}{365^{30}}.$$

We can also consider one person at a time. The first person has some birthday. The probability that the second person has a different birthday is $(1 - 1/365)$. The probability that the third person then has a birthday different from the first two, given that the first two have different birthdays, is $(1 - 2/365)$. Continuing on, the probability that the $k$th person has a different birthday than the first $k-1$, assuming that they all have different birthdays, is $(1 - (k-1)/365)$.

So the probability that 30 people all have different birthdays is the product of these terms:

$$\left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{29}{365}\right).$$

This product is approximately 0.2937, so when 30 people are in the room there is more than a 70% chance that two share the same birthday. A similar calculation shows that only 23 people need to be in the room before it is more likely than not that two people share a birthday.

More generally, if there are $m$ people and $n$ possible birthdays then the probability that all $m$ have different birthdays is

$$\prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right).$$

Using that $1 - k/n \approx e^{-k/n}$ when $k$ is small compared to $n$, we see that if $m$ is small compared to $n$ then

$$\prod_{j=1}^{m-1}\left(1 - \frac{j}{n}\right) \approx \prod_{j=1}^{m-1} e^{-j/n} = \exp\left(-\sum_{j=1}^{m-1}\frac{j}{n}\right) = \exp\left(-\frac{m(m-1)}{2n}\right) \approx \exp\left(-\frac{m^2}{2n}\right).$$

Hence the value of $m$ at which the probability that $m$ people all have different birthdays is $1/2$ is approximately given by the equation $m^2/2n = \ln 2$, or

$$m = \sqrt{2n\ln 2}.$$

For $n = 365$, this approximation gives $m = 22.49$, matching the exact calculation quite well. Mars has $n = 687$ days, so we would need $m = 30.86$ aliens.
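These figures are easy to reproduce; the following snippet (added for illustration, not part of the notes) computes the exact product, the $e^{-m^2/2n}$ approximation, and $\sqrt{2n\ln 2}$ for Earth and Mars.

```python
# Added numeric check of the birthday-paradox calculations.
import math

def all_distinct(m: int, n: int) -> float:
    """Exact probability that m birthdays over n days are all distinct."""
    p = 1.0
    for j in range(1, m):
        p *= 1 - j / n
    return p

print(all_distinct(30, 365))             # ~0.2937
print(all_distinct(23, 365))             # ~0.4927, first value below 1/2
print(math.exp(-30**2 / (2 * 365)))      # ~0.2916, the approximation
print(math.sqrt(2 * 365 * math.log(2)))  # ~22.49
print(math.sqrt(2 * 687 * math.log(2)))  # ~30.86
```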

The following simple arguments give loose bounds and good intuition. Let us consider each person one at a time, and let $E_k$ be the event that the $k$th person's birthday does not match any of the birthdays of the first $k-1$ people. Then the probability that the first $k$ people fail to have distinct birthdays is

$$\Pr(\bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_k) \le \sum_{i=1}^{k}\Pr(\bar{E}_i) \le \sum_{i=1}^{k}\frac{i-1}{n} \le \frac{k^2}{2n}.$$

If $k \le \sqrt{n}$, this probability is less than $1/2$, so with $\lceil\sqrt{n}\rceil$ people the probability is at least $1/2$ that all birthdays will be distinct.

Now assume that the first $\lceil\sqrt{n}\rceil$ people all have distinct birthdays. Each person after that has probability at least $\lceil\sqrt{n}\rceil/n \ge 1/\sqrt{n}$ of having the same birthday as one of these first $\lceil\sqrt{n}\rceil$ people. Hence the probability that the next $\lceil\sqrt{n}\rceil$ people all have different birthdays than the first $\lceil\sqrt{n}\rceil$ is at most

$$\left(1 - \frac{1}{\sqrt{n}}\right)^{\lceil\sqrt{n}\rceil} < \frac{1}{e} < \frac{1}{2}.$$

Hence, once there are $2\lceil\sqrt{n}\rceil$ people, the probability is at most $1/e < 1/2$ that all birthdays will be distinct.

5.2. Balls into Bins

$m$ balls are thrown into $n$ bins, with the location of each ball chosen I+U@R from the $n$ possibilities. The question behind the birthday paradox is whether or not there is a bin with two balls. How many of the bins are empty? How many balls are in the fullest bin? Many of these questions have applications to the design and analysis of algorithms.

Birthday paradox, restated: place $m$ balls randomly into $n$ bins; then, already for $m$ proportional to $\sqrt{n}$, at least one of the bins is likely to have more than one ball in it.

Another interesting question concerns the maximum number of balls in a bin, or the maximum load. Let us consider the case where $m = n$, so that the number of balls equals the number of bins and the average load is 1. Of course the maximum load is at most $n$, but it is very unlikely that all $n$ balls land in the same bin. We seek an upper bound that holds with probability tending to 1 as $n$ grows large.

We can show that the maximum load is more than $3\ln n/\ln\ln n$ with probability at most $1/n$ for sufficiently large $n$, via a direct calculation and a union bound. This is a very loose bound; although the maximum load is in fact $\Theta(\ln n/\ln\ln n)$ with probability close to 1, the constant factor 3 is chosen to simplify the argument and could be reduced with more care.

Lemma 5.1: When $n$ balls are thrown I+U@R into $n$ bins, the probability that the maximum load is more than $3\ln n/\ln\ln n$ is at most $1/n$ for sufficiently large $n$.

Proof: The probability that bin 1 receives at least $M$ balls is at most

$$\binom{n}{M}\left(\frac{1}{n}\right)^{M}.$$

This follows from a union bound; there are $\binom{n}{M}$ distinct sets of $M$ balls, and for any set of $M$ balls the probability that all land in bin 1 is $(1/n)^M$. We now use the inequalities

$$\binom{n}{M}\left(\frac{1}{n}\right)^{M} \le \frac{1}{M!} \le \left(\frac{e}{M}\right)^{M}.$$

The second inequality is a consequence of the following general bound on factorials: since

$$\frac{k^k}{k!} < \sum_{i \ge 0}\frac{k^i}{i!} = e^k,$$

we have

$$k! > \left(\frac{k}{e}\right)^{k}.$$

Applying a union bound again allows us to find that, for $M = 3\ln n/\ln\ln n$, the probability that any bin receives at least $M$ balls is bounded above by

$$n\left(\frac{e}{M}\right)^{M} \le n\left(\frac{e\ln\ln n}{3\ln n}\right)^{3\ln n/\ln\ln n} \le n\left(\frac{\ln\ln n}{\ln n}\right)^{3\ln n/\ln\ln n} = e^{\ln n}\left(e^{\ln\ln\ln n - \ln\ln n}\right)^{3\ln n/\ln\ln n} = e^{-2\ln n + 3(\ln n)(\ln\ln\ln n)/\ln\ln n} \le \frac{1}{n}$$

for sufficiently large $n$.
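To see how tight Lemma 5.1 is in practice, this simulation sketch (our addition, not from the notes) throws $n$ balls into $n$ bins I+U@R and compares the observed maximum load with $3\ln n/\ln\ln n$.

```python
# Added illustration: empirical maximum load versus the Lemma 5.1 bound.
import math
import random
from collections import Counter

def max_load(n: int) -> int:
    """Maximum bin load after throwing n balls into n bins uniformly at random."""
    counts = Counter(random.randrange(n) for _ in range(n))
    return max(counts.values())

for n in (10**3, 10**4, 10**5):
    bound = 3 * math.log(n) / math.log(math.log(n))
    loads = [max_load(n) for _ in range(20)]
    print(f"n={n:6d}  max load over 20 runs: {max(loads)}  bound: {bound:.1f}")
```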